ABSTRACT


This  report  reviews  the  current  state-of-the-art  in  three-dimensional 
display  technology.  The  basic  perceptual  cues  used  to  perceive  the  third 
(depth)  dimension  are  first  described,  and  empirical  data  bearing  on  the 
interaction between these cues are discussed. Generally, when more depth cues
are present, a proportionately more salient sense of depth is conveyed. But
this additive model breaks down when motion is involved. It is concluded that
stereopsis,  motion,  and  occlusion  are  particularly  salient  cues.  Techniques 
for  implementing  perspective  and  stereoptic  displays  are  then  described.  This 
discussion  is  followed  by  a  review  of  3D  display  technology  applications  in 
the following areas: flight deck displays, air traffic control, meteorology,
teleoperation,  and  computer  graphics.  Where  available,  studies  are  discussed 
which  contrast  the  efficacy  of  3D  with  2D  representations.  In  both  laboratory 
and  field  studies,  it  appears  that  the  usefulness  of  stereopsis  is  diminished 
and  may  vanish  altogether  when  displays  are  dynamic. 


THREE-DIMENSIONAL  DISPLAYS: 

PERCEPTION,  IMPLEMENTATION,  AND  APPLICATIONS 

Christopher  D.  Wickens,  Steven  Todd,  and  Karen  Seidler 

CSERIAC  REPORT 

1.0  INTRODUCTION 

Emerging  technology  has  created  a  number  of  new  opportunities  for 
display  design.  Two  technological  forces  in  particular  have  allowed  the 
development  of  three-dimensional  (3D)  displays  in  a  variety  of  crew  station 
systems.  Increases  in  the  speed  of  computer  graphics  software  and  hardware, 
coupled  with  advances  in  dynamic  stereoscopic  imaging  technology,  have  enabled 
the  design  of  displays  which,  by  more  closely  resembling  the  domain  of  objects 
and  events  they  are  meant  to  depict,  provide  more  "natural"  viewing 
conditions.  This  greater  naturalism  results  from  the  direct  coding  of 
distances  along  the  line  of  sight  into  the  display,  which  is  accomplished  by 
using  various  aspects  of  display  technology.  We  refer  to  this  distance 
information  as  the  "depth  axis"  or  z-axis  of  a  display.  Our  definition  of  the 
term  "3D"  encompasses  the  use  of  any  technique,  whether  stereoscopy  or  any  of 
the  cues  that  artists  build  into  a  perspective  painting,  to  create  a  sense  of 
depth  along  the  z-axis. 

Examples  of  3D  display  applications  abound  and  will  be  described  in 
considerable detail in later sections of this report. Representative examples
can  be  drawn  from  the  field  of  aviation:  the  pilot's  map  of  the  approach  path 
or  rendezvous  point;  the  display  of  aircraft  locations  in  the  airspace  used  by 
air  traffic  controllers;  displays  of  geographical  and  topographical  features 
used  by  helicopter  pilots.  These  3D  displays  all  serve  to  provide  the  viewer 
with  a  better  understanding  of  physical  location  and  situation  awareness. 
Displays  of  this  sort  are  beneficial  outside  aviation  as  well.  The 
meteorologist can gain a better understanding of an evolving weather pattern
by  viewing  a  3D  representation  of  air  patterns;  operators  of  remotely 
manipulated  vehicles,  or  teleoperator  (remote  manipulator)  systems  will 
benefit  from  the  precise  depth  representation  of  the  environment;  human 
factors designers, using 3D displays along with the anthropometric model of a
workspace,  can  better  envision  how  the  intended  worker  will  function  there, 
and  the  constraints  the  space  will  impose  on  movement.  In  the  preceding 
examples,  depth  dimensions  conveyed  by  display  techniques  generally  correspond 
directly  with  depth  or  distance  in  the  physical  world.  Such  display 
techniques  have  also  been  used  to  represent  nonspatial  characteristics.  One 
example might be the use of overlaid "windows" in computer systems like the
Apple  Macintosh.  Here  distance  has  ordinal,  not  metric  properties,  and  is 
used  to  convey  a  sense  of  priority  or  recency.  A  common  example  is  the  use  of 
the  depth  axis  in  3D  graphics  to  represent  mathematical  relationships 
involving  3  (or  more)  variables.  Great  strides  in  this  area  have  been  made 
recently  in  the  field  known  as  "scientific  visualization,"  (McCormick  et  al., 
1987)  in  which  complex  equations  can  be  generated,  and  3D  graphics  images 
rotated  and  "explored"  in  a  way  that  allows  a  better  appreciation  of  the 
relation  between  variables. 

Two  basic  human  factors  arguments  may  be  made  for  the  implementation  of 
3D displays: (1) The visual scene of a 3D world is a more "natural,"
"ecological," or "compatible" representation than that provided by 2D
displays; and, closely related, (2) A single integrated representation of one
object,  relation,  or  scene  reduces  the  need  for  a  mental  integration  of  two  or 
three  representations. 

Despite  the  intuitive  appeal  of  the  preceding  arguments  and  the  esthetic 
appeal  of  most  3D  renderings,  the  advisability  of  implementing  3D  displays  is 
not a foregone conclusion. There are three potential costs or risks
associated  with  this  display  technology. 

(1)  As  Gregory  (1977)  has  so  eloquently  argued,  any  representation  of  a 
3D  world  on  a  2D  image  surface  produces  an  inherent  ambiguity.  The  absolute 
distance  represented  by  a  point  along  the  line  of  sight  cannot  be  ascertained 
with high accuracy, compared to absolute distances parallel with the viewing
image plane (the plane orthogonal to the line of sight). (This limitation is
also characteristic of direct viewing.) Thus, 3D displays create the
potential  for  perceptual  ambiguity. 

(2)  Somewhat  related  to  point  1,  the  integration  of  all  3  dimensions  of 
space into a single 3-dimensional object may result in reduced precision in
reading values along any one particular axis (Carswell & Wickens, 1987).
Thus, the improved holistic awareness of space gained by 3D representation
may come at the expense of analytic detail.

(3)  3D  displays  usually  bring  with  them  an  added  set  of  design  issues, 
such  as  establishing  the  optimum  field  of  view  and  viewing  angle,  along  with 
technological  hardware  issues  (related  for  example  to  glasses  or  image 
generation  in  stereoscopic  displays),  which  considerably  complicate  the 
display designer's task. These issues will be discussed in Section 4 of this
report.

The  preceding  suggests  that  the  advantages  gained  by  3D  technology  are 
probably  somewhat  task  specific  and  that  tasks  in  which  a  holistic  awareness 
is  critical  may  be  facilitated  by  the  use  of  3D  technology.  Whatever  the 
task,  it  is  true  that  the  sense  of  depth  we  gain  from  a  display  depends  upon  a 
number  of  depth  cues  that  can  be  incorporated  to  simulate  a  sense  of  "natural" 
3D world viewing. In the following report, Section 2 describes the nature of
those  cues  from  the  standpoint  of  basic  human  perception.  Section  3  discusses 
how depth cues work in combination and, therefore, how the display designer
can  create  the  strongest  sense  of  depth  by  choosing  appropriate  cues.  Section 
4  addresses  some  issues  in  going  from  the  perceptual  description  of  the  cues 
to  their  actual  display  implementation.  Section  5  presents  a  review  of 
studies  that  have  examined  and  evaluated  3D  display  technology  in  6  main 
application  areas:  aviation,  air  traffic  control,  graphics,  geosciences, 
computer-aided  design,  and  medical  imaging.  It  should  be  noted  that  much  of 
the  applied  design  research  in  Section  5  has  not  appeared  in  peer-reviewed 
scientific  journals.  Effort  has  been  made  by  the  present  authors  to  establish 
the  validity  of  those  reports  and  proceedings  papers  included,  which  have  not 
received  this  review. 

2.0 BASIC CUES FOR DEPTH PERCEPTION


The  designer  of  any  3D  display  seeks  to  create  a  compelling  and  accurate 
sense  of  three-dimensionality,  signaling  to  the  viewer  such  information  as  an 
object's  relative  and  absolute  distance,  its  3D  shape,  and  its  orientation  or 
the  "slant"  of  its  surfaces.  In  the  current  section  we  shall  first  describe  a 
number  of  cues  that  the  human  perceptual  system  uses  to  extract  this 
information  from  the  visual  scene,  along  with  various  constraints  to  their 
application.  With  appropriate  display  technology  any  and  all  of  these  cues 
can  be  incorporated  into  synthetic  displays.  Then  in  Section  3,  we  discuss 
empirical  data  from  perceptual  experiments  that  have  examined  the  strength  or 
salience  of  these  cues  in  creating  a  compelling  sense  of  depth.  This  is 
critical  information  for  the  display  designer,  who  must  weigh  the  advantages 
of  more  or  better  depth  perception  cues  against  the  expense  of  the  extra 
programming  or  more  sophisticated  (and  therefore  less  reliable)  technology 
necessary  to  provide  them.  Hence,  the  designer  should  have  information 
regarding  the  most  compelling  cues. 

The  phenomenon  of  depth  perception  may  be  delineated  into  three  parts: 
judgments  of  absolute  distance,  judgments  of  relative  distance,  and  the 
perception of an object itself as three dimensional. A judgment of absolute
distance  is  paraphrased  as  "How  far  away  is  that  object  from  my  point  of 
view?";  relative  distance  as  "How  far  apart  are  those  objects  from  each 
other?";  and  object  perception  as  "What  is  the  true  3D  shape  of  that  object?" 

While  the  objective  world  consists  of  3D  objects  with  weight  and  volume, 
our  visual  perception  of  this  world  is  built  upon  the  two-dimensional  images 
that  fall  upon  our  retinas.  In  order  to  transform  these  images  into  a 
perception  with  depth  we  utilize  a  number  of  "depth  cues."  These  depth  cues 
result  from  consistent  patterns  seen  in  the  two-dimensional  images  that  depict 
the 3D world and from the physical structure of the human visual system.
These depth cues may be represented in the objective stimulus present on the
retina or by the state of the visual system--its optics and its muscles. The
former are sometimes referred to as pictorial or world-centered cues. The
latter are referred to as observer-centered cues.

Depth  cues  may  be  classified  under  the  effects  of  light,  occlusion, 
object size, height in the visual field, the effects of movement, muscular
sensations,  and  binocular  viewing.  The  following  list  of  depth  cues  is 
organized  according  to  this  categorization.  Each  cue  description  is  followed, 
where  applicable,  by  a  list  of  constraints  that  would  limit  the  effectiveness 
of  the  cue  as  it  is  incorporated  into  display  design.  Visual  interpretation 
of  these  cues  in  an  aviation  context  can  be  facilitated  by  reference  to  Figure 
2.1.  Descriptions  of  Figure  2.1  in  the  text  below  refer  by  number  to  specific 
cues  in  the  figure. 

2.1  Light 

2.1.1 Luminance/brightness effects. Objects, parts of objects, or
simply  regions  may  be  perceived  to  be  at  different  depths  as  a  result  of 
differences  in  luminance  emanating  from  the  various  parts  of  the  object  (also 
see  2.1.3,  Shadows  and  highlights,  below). 

Egusa  (1977,  1983)  examined  the  perception  of  two  adjacent  regions  when 
the  two  regions  were  shaded  with  combinations  of  achromatic  hues:  black, 
three  shades  of  gray,  and  white.  He  found  that  as  the  brightness  difference 
between  the  regions  increased,  the  amount  of  depth  perceived  separating  the 
regions  increased.  Unfortunately,  the  direction  of  this  perception  differed 
between subjects. There was no consistent trend for the darker shades to
signal either closer or more distant regions.


[Figure 2.1: scene illustrating various cues for depth perception.]


Dosher, Sperling, and Wurst (1986) and Schwartz and Sperling (1983)
tested  the  hypothesis  that  variation  in  brightness  induces  a  sensation  of 
depth  by  assuming  that  the  brighter  part  of  a  computer-generated,  luminous, 
wire-frame  object  would  be  perceived  to  be  in  the  foreground.  They  found  that 
the brighter parts of an object appeared closer to the observer than the
dimmer parts. Dosher et al. (1986) define this cue as Proximity Luminance
Covariance (PLC) (see also Section 3.6).

Figure 2.1: #1: The difference in luminance along the runway serves as
a depth cue. The foreground is brighter and more clearly defined.

Constraints:

*Using brightness to code discrete levels of depth appears to be ambiguous
regarding the observer's perception of the object's closeness.

*Provides relative, but not absolute, distance information.

2.1.2 Aerial perspective. Desaturation and/or the addition of the
environment's ambient hue to an object's color can affect the perceived depth
of an object, particularly of relatively distant objects. Desaturation of an
object's objective color is caused by the atmospheric scattering of light
(resulting in a grayer color). The addition of the ambient hue, usually blue,
is again a result of the increased atmosphere present between the observer and
the perceived object. Hence, desaturation and a bluish hue would convey
information that objects are at greater distance.

Figure 2.1: #2: The mountains in the background would show these
effects.

Constraints:

*Provides  relative  but  not  absolute  distance  information. 

2.1.3 Shadows and highlights. The perceived depth of an object (the
surface of an object, or the surface upon which the object casts its shadow)
may be affected by the presence of shadows or highlighting on the object
itself or upon surfaces contiguous with the object.

A  shadow  may  be  attached  (cast  by  and  falling  upon  the  object  itself)  or 
cast  (falling  off  the  object  onto  a  background).  As  a  depth  cue,  an  attached 
shadow  shows  the  characteristics  and  shape  of  an  object's  surface, 
particularly  by  indicating  whether  a  given  area  is  extended  or  indented  from 
the surrounding surface (Weintraub & Walker, 1968; Cavanagh, 1987).
Perception of a two-dimensional shape (e.g., an ellipse) as a 3D object (an
ellipsoid) is created with the use of attached shadows (Todd, 1985). A cast
shadow influences perception of the surface upon which an object casts its
shadow, as well as the judgment of distance of an object (Rock, Wheeler,
Shallo, & Rotunda, 1982).

Figure 2.1: #3: Attached shadows are shown on the sphere and right
wall of the first control tower, and on the lower right of the runway. The
latter shows an indentation in the surface of the runway.

Figure 2.1: Cast shadows are shown near both towers. The cast
shadow of the closer tower indicates a bump in the surface of the runway. The
angle  between  the  two  shadows  indicates  the  relative  positioning  of  the  light 
source  and  the  two  towers  in  3D  space.  To  the  extent  that  this  angle  is 
greater  than  zero,  the  towers  are  farther  apart  relative  to  their  distance 
from  the  source. 

A light source may be specular (i.e., roughly parallel rays of light
from one source) and vary in intensity and position, or it may be diffuse. A
highlight is a strong "spot" of reflected light from a specular source. The
location  of  a  highlight  on  an  object's  surface  is  dependent  on  the  position  of 
both  the  light  source  and  the  observer.  If  the  object  or  observer  moves,  the 
location  of  the  highlight  will  shift  across  the  surface  of  the  object 
(Kennedy, 1974). A highlight will move in the same direction as the moving
light source or observer across a convex surface, and in the opposite direction
across  a  concave  surface.  Its  intensity  and  the  qualities  of  the  object's 
surface  determine  the  kind  of  highlight  that  will  be  perceived  (e.g.,  a  simple 
bright  spot,  a  bright  spot  encircled  with  a  diffuse  coma).  Diffuse 
illumination  generally  does  not  produce  shadows,  though  it  allows  perception 
of the entire face of the object, and its contour, against the background.

Figure 2.1: A highlight is present on the sphere of the first
tower. As the observer moves past the tower (e.g., flying over the runway)
this highlight will move around the sphere in the observer's direction.

Constraints:

*Light, as a specular source, is assumed to come from above, not from the side
or bottom of an object (Berbaum, Tharp, & Mroczek, 1981; Haber, 1980). This
influences perception of: i) attached shadows: In general, if an area's
lower half is shaded, this area is seen as extending forward; if an area's
upper half is shaded, the area is seen as indented (Rock, 1975); ii)
highlights: Highlights produced by light from above induce greater depth
effects than highlights from a side source (Berbaum et al., 1983).

*The functional use of cast shadows as a relative depth cue depends on a
specular light source located relatively close to the objects (e.g., a flare
or spotlight, rather than the sun). This is necessary to cause a
distinguishable difference between the angles of the objects and their
respective cast shadows (Rock et al., 1982). Otherwise, cast shadows only
provide information of the immediate area (Rock, 1975).

*The qualities of the object's surface (shiny or dull) will affect the amount
of light reflected from the object (and therefore its shading). A curved
surface is perceived as more curved if its surface is shiny rather than dull
(however, judgments concerning the degree of curvature are not more accurate)
(Todd & Mingolla, 1983).
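
The dependence of the cast-shadow cue on a nearby light source can be checked with simple geometry. The following Python sketch is an illustration only, under assumptions that are not part of the cited studies: flat ground, vertical towers, and a point light source. The function names and coordinates are hypothetical. On flat ground, each shadow points directly away from the spot beneath the light, so the angle between two shadows shrinks toward zero as the source recedes.

```python
import math

def shadow_direction(light_xy, tower_xy):
    """Unit vector along which a vertical tower's shadow falls on flat ground.

    Assumption: the shadow points directly away from the ground point
    beneath a point light source.
    """
    dx = tower_xy[0] - light_xy[0]
    dy = tower_xy[1] - light_xy[1]
    length = math.hypot(dx, dy)
    return dx / length, dy / length

def shadow_angle_deg(light_xy, tower_a, tower_b):
    """Angle in degrees between the cast shadows of two towers."""
    ax, ay = shadow_direction(light_xy, tower_a)
    bx, by = shadow_direction(light_xy, tower_b)
    cross = ax * by - ay * bx
    dot = ax * bx + ay * by
    return math.degrees(math.atan2(abs(cross), dot))

# Hypothetical layout: two towers 100 units apart.
towers = ((-50.0, 0.0), (50.0, 0.0))
flare = shadow_angle_deg((0.0, -100.0), *towers)   # nearby source
sun = shadow_angle_deg((0.0, -1.0e9), *towers)     # very distant source
print(f"nearby flare: {flare:.2f} deg, distant source: {sun:.6f} deg")
```

With the nearby source the two shadows diverge by a clearly visible angle, whereas with the distant source they are effectively parallel and carry no relative-position information.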

2.1.4 Color. Differences in color--that is, hue, saturation and
brightness--affect an object's perceived depth.

[Also see Aerial Perspective, 2.1.2 above]

Egusa  (1983)  examined  the  effects  of  brightness  [see  2.1.1  above],  hue, 
and  saturation  on  perceived  depth.  Concerning  hue,  Egusa  compared  depth 
perception  of  two  adjacent  regions.  In  one  condition  achromatic  hues  (black, 
gray,  or  white)  were  compared  with  chromatic  hues  (red,  blue,  or  green).  In 
another  condition,  chromatic  hues  were  compared  with  other  chromatic  hues. 

Over  both  conditions,  red  was  judged  to  be  closest  to  the  observer  followed  by 
green,  and  then  blue.  Concerning  saturation,  as  the  difference  in  saturation 
levels  between  the  two  regions  increased,  so  did  the  perceived  distance 
separating  the  regions.  However,  the  direction  of  perceived  depth  between 
saturated  and  desaturated  levels  differed  across  hues. 

Egusa  hypothesized  that  the  usefulness  of  color  as  a  depth  cue  arises 
from  the  chromatic  aberration  of  light.  As  different  wavelengths  of  light 
pass  through  the  lens  of  the  eye,  these  wavelengths  bend  and  leave  the  lens  at 
different angles (much as a prism creates a "rainbow"). This in turn causes
different  focal  points  within  the  eye  for  the  different  wavelengths  of  light 
(and  therefore  a  need  for  differential  accommodation  of  the  lens).  This  is 
called  chromatic  aberration. 

Constraints:

*Ambiguity of saturation differences.

*Does not provide absolute depth information.

*Color is salient, and color coding different objects or depth planes
may signal unwanted and irrelevant information. For example, the choice of
red to signal "close," may also be perceived as "danger."

*Ineffective and/or misleading for the 7% of the male population whose vision is
color deficient.

*Unreliable in varying conditions of ambient illumination.

2.1.5 Texture gradients. The perceived depth of a surface (for
example, the earth's terrain)--whether flat, slanted, or curved--is affected
by its texture.

Most  objects  perceived  visually  by  human  observers  contain  textured 
patterns  across  their  surfaces.  This  texture  is  defined  by  the  size  and 
spacing of the elementary features of which it is composed. This surface,
when viewed at an angle other than 0° or 90°, results in a texture gradient. A
typical example is the viewing of a large field of grass. The elementary unit
is  a  blade  of  grass.  The  "gradient"  is  the  change  of  texture  perceived  as  one 
looks  from  one's  feet  up  to  the  horizon.  When  texture  gradients  are  used  to 
convey  a  sense  of  depth,  the  viewer  assumes  that  the  elementary  texture  units 
are  of  roughly  the  same  size  and  are  approximately  equally  spaced  across  the 
surface (Cutting & Millard, 1984; see also Braunstein, 1976).

Cutting and Millard distinguished between three static gradients
concerning depth perception: i) perspective gradient; ii) compression
gradient; and iii) density gradient. Couched in an analogy of viewing a
hallway with a tiled floor:

i) Perspective gradient is measured by the change in the x-axis width of
an elementary feature (e.g., a tile).

ii) Compression gradient is the ratio of y/x axes measures of the
elementary feature (Figure 2.1: #6).

iii) Density gradient is measured by the number of tiles per unit of
visual angle (Figure 2.1: #7).
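
The three gradients can be made concrete for the tiled-floor analogy. The Python sketch below is an illustration under stated assumptions, not Cutting and Millard's procedure: it assumes a pinhole projection, square tiles on a flat floor, and hypothetical values for eye height and focal length. Density is approximated here as tiles per unit of image area rather than per unit of visual angle.

```python
def tile_image(z_near, tile=1.0, eye_height=1.5, focal=1.0):
    """Image-plane width and height of one square floor tile.

    z_near is the distance from the eye to the tile's near edge.
    All parameter values are hypothetical.
    """
    z_far = z_near + tile
    width = focal * tile / z_near                                # x extent of the near edge
    height = focal * eye_height * (1.0 / z_near - 1.0 / z_far)   # y extent
    return width, height

def gradients(z_near):
    """Return (perspective, compression, density) measures for one tile."""
    width, height = tile_image(z_near)
    perspective = width               # x-axis width, shrinks with distance
    compression = height / width      # y/x ratio, tile looks flatter with distance
    density = 1.0 / (width * height)  # approximate tiles per unit image area
    return perspective, compression, density

for z in (2.0, 4.0, 8.0):
    p, c, d = gradients(z)
    print(f"distance {z:4.1f}: width {p:.3f}, y/x {c:.3f}, density {d:.1f}")
```

As the tile recedes, its projected width and its y/x ratio both fall while the density measure rises, which is the pattern the three gradients describe.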

Linear  perspective  is  a  special  case  of  a  perspective  texture  gradient 
when  the  elements  involved  are  all  of  objectively  equal  sizes  and  objectively 
consistent  density.  A  particularly  strong  effect  occurs  when  these  elements 
form  parallel  lines  on  the  actual  surface  that  is  viewed;  railroad  tracks  are 
a typical example. (Figure 2.1: #8: Linear perspective of the parallel
sides of the runway.)

The perception of slant of a flat surface is guided largely by the
perspective gradient (65%), to a lesser degree by the density gradient (28%),
and least by the compression gradient (6%). The perception of curvature in a
surface is overwhelmingly guided by the compression gradient (96%), with the
perspective and density gradients providing very weak influences (2% apiece)
(Cutting & Millard, 1984). When linear perspective is used in the
construction of 3D objects or scenes, these are described as using polar
projection. When it is not, a parallel projection is created (see Figure
2.2).
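
The difference between the two projections can be shown with the vertices of a wireframe cube. The Python sketch below is a minimal illustration, assuming that "polar projection" corresponds to ordinary perspective projection onto a plane at z = 0 with the eye a hypothetical distance behind it. The function names and the viewing distance are assumptions made for the example.

```python
def parallel_projection(vertex):
    """Drop the depth coordinate: size on the display is independent of depth."""
    x, y, _z = vertex
    return (x, y)

def polar_projection(vertex, eye_distance=3.0):
    """Scale x and y by depth, so farther vertices are drawn nearer the center.

    The eye sits eye_distance behind the projection plane at z = 0.
    """
    x, y, z = vertex
    scale = eye_distance / (eye_distance + z)
    return (x * scale, y * scale)

# Cube with its near face on the projection plane (z = 0) and far face at z = 2.
cube = [(x, y, z) for z in (0.0, 2.0) for y in (-1.0, 1.0) for x in (-1.0, 1.0)]

for vertex in cube:
    print(vertex, parallel_projection(vertex), polar_projection(vertex))
```

Under parallel projection the near and far faces coincide on the display, whereas under polar projection the far face is drawn smaller, which supplies the linear perspective cue.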

Constraints: 

*Perspective works best for regularly textured surfaces and poorly for random
dots  or  random  shapes  of  irregular  size. 

*Subjects estimating the degree of slant from a photograph systematically
underestimate  the  objective  slants  of  the  surfaces  (Gibson,  1950).  That  is, 
they  estimate  the  surface  to  be  more  closely  orthogonal  to  the  line  of 
sight  than  is  the  case.  This  effect  is  significantly  greater  for  irregular 
surface  textures  than  for  regular  surface  patterns. 

Figure 2.2. Examples of a wireframe cube in parallel projection (left) or
polar projection (right).

2.2 Occlusion or Interposition

The  perceived  depth  of  objects  (or  parts  of  objects)  is  affected  by  the 
apparent  interposition  of  these  objects  relative  to  the  observer's  viewpoint. 

The  occlusion/interposition  cue  results  from  the  perceptual  organization 
of  the  objective  image  by  the  observer.  An  assumption  is  made  that  the  more 
distant  object  does  indeed  continue  behind  the  occluding  object.  Hence,  more 
familiar  objects  increase  the  effectiveness  of  occlusion  (Schiffman,  1982; 
Ittelson & Kilpatrick, 1952).

Figure 2.1: #9: Occlusion of the mountains and the rightmost jet.

Constraints:

*Occlusion  is  necessarily  an  ordinal  and  relative  depth  cue.  For  example,  the 
use  of  occlusion  to  represent  a  rotating  sphere  allows  for  perception  of  only 
ordinal  depth  (i.e.,  which  surface  of  the  sphere  is  closer),  and  does  not 
assist  in  the  perception  of  the  object's  shape/depth  in  3D  space  (Andersen  & 
Braunstein,  1983). 

2.3 Object Size

The  perceived  size  of  an  object  can  serve  as  a  depth  perception  cue  in  a 
number  of  ways. 

2.3.1 Size-distance invariance. This cue concerns the perceived depth
of objects based on their assumed or perceived size, and the size of the
visual angle subtended by their retinal image. This relation may be expressed
by the approximate formula: Size = Visual Angle x Distance. When true size
is known (or estimated), then the expression can be used to calculate the
distance  of  an  object  from  the  observer.  When  distance  is  known  (from  other 
cues)  then  the  formula  provides  the  means  for  deriving  a  perceived  size.  The 
relation  has  two  general  implications: 

i) Objects of a greater visual angle are perceived as closer than those
of a smaller angle.

Figure 2.1: #10: The rightmost building appears to be closer than the
smaller building to its immediate left.

ii) Objects of the same visual angle are perceived to be the same
distance away.

Figure 2.1: #11: These two jets are the same objective size and appear
the same distance away.

Constraints:

*Applicable primarily to familiar objects whose true sizes are known.
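
The size-distance relation can be worked through numerically. The Python sketch below is a minimal illustration, assuming the visual angle is expressed in radians so that the approximate formula Size = Visual Angle x Distance holds for small angles. The function names and the 20 m aircraft length are hypothetical.

```python
import math

def distance_from_size(true_size, visual_angle_deg):
    """Distance implied by a known (or assumed) size and a visual angle."""
    return true_size / math.radians(visual_angle_deg)

def size_from_distance(distance, visual_angle_deg):
    """Size implied by a known distance and a visual angle."""
    return math.radians(visual_angle_deg) * distance

# Hypothetical example: two jets assumed to be 20 m long.
near = distance_from_size(20.0, 2.0)   # subtends 2 degrees
far = distance_from_size(20.0, 1.0)    # subtends 1 degree
print(f"2 deg -> {near:.0f} m, 1 deg -> {far:.0f} m")
```

Halving the visual angle of an object of assumed constant size doubles its implied distance, which corresponds to implication i) above.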

2.3.2 Size by occlusion. Perceived object size is supported by
occlusion, with object size estimated by the number of elementary texture
units of a background surface occluded by that object (Gibson, 1966). The
perceived distance of the object then becomes a function of the visual angle
it subtends (since Perceived size = Visual angle x Perceived Distance).

Figure 2.1: #12: The upper jet occludes a greater number of texture
elements and is therefore perceived as more distant.

Constraints: 

*A  relative  distance  cue  is  reliable  only  if  texture  of  the  surface  is  uniform 
behind  all  objects. 

2.3.3 Familiar size. Familiarity of an object to the observer can
serve as a cue for absolute depth perception by influencing perceived object
size.

A  familiar  object  tends  to  maintain  a  constant  perceived  size,  no  matter 
what its objective visual angle (Schiffman, 1982). The perceived distance of
that object then becomes a function of the visual angle it subtends (since
Perceived size = Visual angle x Perceived Distance) (Ono, 1969).

Figure 2.1: #13: The five jets in flight are assumed to be all of
equal  size.  Since  the  leftmost  jet  subtends  the  smallest  visual  angle  of  the 
five,  it  will  be  perceived  as  the  farthest  away. 

Constraints: 

*Potentially misleading when used with similarly shaped objects that might
have  very  different  sizes  (e.g.,  runway  lengths,  aircraft). 

2.4  Height  in  the  Visual  Field 

The  vertical  position  of  objects  and  parts  of  objects  in  the  visual 
scene--from the observer's viewpoint--can act as a depth perception cue. The
higher  an  object  is  in  the  visual  field,  the  farther  it  appears  to  lie  from 
the  observer  (Berbaum  et  al.,  1983).  This  cue  is  based  on  the  observer's 
assumption  that  the  "ground  plane"  upon  which  the  observer  stands  extends 
outward  horizontally  to  the  horizon  (Rock,  1975).  That  is,  in  a  typical 
visual  scene,  it  is  assumed  that  the  foreground  is  low  and  the  horizon  high. 
Conversely,  objects  that  are  horizontally  adjacent  will  be  perceived  as 
equidistant  (Schiffman,  1982;  see  also  Gogel,  1965). 
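
The ground-plane assumption behind this cue can be expressed geometrically. The Python sketch below is an illustration only, assuming a flat ground plane, an observer at a known eye height, and objects resting on the ground. The function names and the numerical values are hypothetical. An object on the ground at a greater distance lies at a smaller angle below the horizon, and therefore higher in the visual field.

```python
import math

def angle_below_horizon_deg(eye_height, distance):
    """Angle below the horizon at which a point on flat ground is seen."""
    return math.degrees(math.atan2(eye_height, distance))

def distance_on_ground(eye_height, angle_deg):
    """Ground distance implied by an angle below the horizon."""
    return eye_height / math.tan(math.radians(angle_deg))

eye_height = 100.0  # hypothetical viewpoint height, in meters
for distance in (500.0, 2000.0, 8000.0):
    angle = angle_below_horizon_deg(eye_height, distance)
    print(f"{distance:6.0f} m -> {angle:5.2f} deg below the horizon")
```

The same inversion fails for objects that are not on the ground plane, such as aircraft in flight, since their height in the field no longer maps onto a single distance.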

Figure 2.1: #14: These two jets subtend the same visual angle, but the
objectively higher jet appears farther away (therefore, according to the
size-distance invariance cue, it also appears to be larger).

Figure 2.1: #15: The jet and the one building are horizontally
adjacent and will appear as equidistant.

Constraints: 

*Depends  on  the  "standard"  viewing  perspective:  looking  down  on  objects. 
Unreliable  and  misleading  when  viewing  objects  from  below  (i.e.,  as  in  viewing 
other  aircraft  in  the  airspace  above,  or  viewing  submarines  or  ships  from  a 
viewpoint  below). 

2.5 Motion


Sufficient motion for the perception of depth may result from any
movement of the observer (in entirety or just head movement) and/or movement
of the object(s). This motion will alter the relative distance and the
original orientation between the observer and the object(s).

2.5.1 Motion perspective. This cue concerns the depth perception of
objects  from  motion.  Motion  perspective  allows  the  perception  of  the  relative 
distances,  velocities,  and  locations  as  well  as  the  direction  of  movement  of 
these  objects,  as  the  observer's  viewpoint  of  the  environment  changes.  The 
greater  the  objective  distance  between  the  objects,  the  greater  is  the 
shifting  of  their  images  relative  to  one  another  with  observer  movement. 

Figure 2.1: #16: As the observer moves forward, over the runway, the
lower building--now partly obstructed by the first control tower--will first
disappear and then become visible. The relative distance and position of the
two buildings will become increasingly clear.

Motion  parallax  is  a  unique  case  of  motion  perspective.  It  describes 
the  effects  of  a  relative,  lateral  motion  between  the  observer  and  the 
object(s).  Objects  closer  to  the  moving  observer  are  usually  perceived  to  be 
moving faster than those objects at a distance. Thus an observer may move his
head  from  side  to  side  to  determine  which  of  two  objects  is  closer.  The 
apparent  direction  of  these  objects  and  their  perceived  speed,  however,  are  a 
function  of  the  observer's  fixation  point  (see  Figure  2.3)  (Schiffman,  1982). 
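
The dependence of apparent direction and speed on the fixation point can be illustrated with a minimal sketch. It assumes a small-angle approximation, a purely lateral observer translation, and objects lying close to the line of sight; the function name and the sample speeds and distances are illustrative and do not come from the cited studies.

```python
def relative_angular_velocity(observer_speed, object_distance, fixation_distance):
    """Approximate angular velocity (rad/s) of an object relative to the
    fixated point while the observer translates laterally.

    Positive values: the object appears to move in the same direction as
    the observer. Negative values: it appears to move in the opposite
    direction. Small-angle approximation; distances in metres.
    """
    return observer_speed * (1.0 / fixation_distance - 1.0 / object_distance)


if __name__ == "__main__":
    speed = 0.1      # hypothetical lateral head speed, m/s
    fixation = 2.0   # hypothetical fixation distance, m
    for distance in (0.5, 1.0, 2.0, 4.0, 8.0):
        omega = relative_angular_velocity(speed, distance, fixation)
        print(f"object at {distance:4.1f} m: {omega:+.4f} rad/s")
```

Under these assumptions, objects nearer than the fixation point receive a negative value (apparent motion against the observer's movement), objects beyond it a positive value, and the fixated object itself a value of zero, which is the pattern described for Figure 2.3.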

Rogers  and  Graham  (1979)  examined  motion  parallax  produced  by  either 
movement  of  the  display  or  movement  of  the  observer.  They  found  that  self- 
produced  movement  by  the  observer  resulted  in  a  greater  sensation  of  depth. 
However,  with  larger  amounts  of  relative  movement,  the  amount  of  perceived 
depth was less than expected. Rogers and Graham also emphasized the
importance of a complex display with many objects and features in producing a
strong sensation of depth.

Figure 2.3. Schematic diagram of motion parallax. If an object located at F
is fixated while the observer moves to the left, the images of the nearer
objects appear to move to the right whereas farther objects seem to move to
the left. The length of the arrows indicates that the apparent velocity of
objects in the field of view increases in direct relation to their distance
from the fixation point (based on Gibson, 1950; from Schiffman, 1982).

Constraints:

*Ono, Rivest, and Ono (1986) examined depth perception from motion parallax
created by self-produced observer movement. This movement activated sensors
positioned on the head which drove aspects of the display. As viewing
distances increased, observers lost the perception of depth and saw only a
flat, two-dimensional motion of the objects across the screen of the CRT.
Loss of perceived depth was a function of individual differences, the degree
of disparity used, and the viewing distance.

*Motion  parallax  is  primarily  a  relative  depth  cue  (Rogers  &  Graham,  1979; 
Farber  &  McConkie,  1979),  although  it  may  also  provide  some  absolute  distance 
information  (Landy,  1987). 

2.5.2  Object  perception.  This  cue  concerns  the  perception  of  depth 
which  describes  an  individual  object's  3D  structure  from  its  rotation.  This 
process  is  also  called  the  "recovery  of  structure  from  motion"  (Braunstein, 
1986). 

The kinetic depth effect (KDE) describes the perception of a 3D object
from a 2D stimulus; this stimulus is the flat shadow of a rotating 3D
wire-frame object, typically a wire cube, cast upon a translucent screen before
an observer. Rotation of the object is around the X or Y axis of the screen
(i.e., an axis orthogonal to the Z axis). The perception of the object as 3D
will emerge only when the object is put into motion.

Constraints: 

*The shadow of the rotating object may be projected onto the screen in
parallel or polar fashion (see Figure 2.2). Braunstein (1986) states that
kinetic depth from parallel projection "provides information about object
shape but leaves depth order ambiguous"; the use of polar projection
introduces a perspective gradient and allows the perception of depth order.
For a perception of depth order with the use of parallel projection, other
cues, such as stereoptic disparity, must be included (see Richards, 1985).
Todd (198^) states, however, ". . . it is yet to be determined which degree of
perspective results in the most precise perceptual specification of an
object's 3D structure."

*Integration of the changing two-dimensional information over time is
necessary. The longer the viewing time, the more accurate the perception;
shorter viewing times often lead to the perception of flatter-than-correct
objects (Ullman, 1979; Lappin et al., 1980; Green, 1961; White & Meuser,
1960).

*The kinetic depth effect provides only relative depth information to the
observer; that is, the 3D shape of the object in space is conveyed, but not
its absolute distance from the observer (Landy, 1987).
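
The depth-order ambiguity under parallel projection noted in the constraints above can be checked numerically. The following is a minimal sketch, not a reconstruction of any cited experiment: the cube, the rotation angle, and the eye distance are hypothetical, and the "polar" case is modeled as a simple perspective projection toward an eye placed on the Z axis.

```python
import math


def rotate_y(point, angle):
    """Rotate a 3D point about the vertical (Y) axis by angle (radians)."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (x * c + z * s, y, -x * s + z * c)


def project_parallel(point):
    """Parallel (orthographic) projection onto the screen: drop Z."""
    x, y, _ = point
    return (x, y)


def project_polar(point, eye_distance):
    """Polar (perspective) projection; the eye sits on the +Z axis at
    eye_distance from the origin, so larger z means nearer to the eye."""
    x, y, z = point
    scale = eye_distance / (eye_distance - z)
    return (x * scale, y * scale)


def mirror_depth(point):
    """Reflect a point in depth (reverse its depth order)."""
    x, y, z = point
    return (x, y, -z)


def projected_images(points, angle, projector):
    """Screen images of all points after rotation by angle."""
    return [projector(rotate_y(p, angle)) for p in points]


if __name__ == "__main__":
    cube = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
    mirrored = [mirror_depth(p) for p in cube]
    angle = 0.3  # hypothetical rotation, radians
    same = projected_images(cube, angle, project_parallel)
    flipped = projected_images(mirrored, -angle, project_parallel)
    print("parallel images identical:",
          all(math.isclose(a[0], b[0], abs_tol=1e-12) for a, b in zip(same, flipped)))
```

In this sketch an object and its depth-mirrored twin rotating in the opposite direction cast identical shadows under parallel projection, so shape is specified but depth order is not. With `project_polar` the two shadows differ, because nearer points are magnified relative to farther ones.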

2.6 Muscular Sensations

The muscles that control different aspects of the visual system provide
proprioceptive information that can serve as depth perception cues.

2.6.1 Accommodation. This cue concerns the depth perception of
objects, which results from the different levels of adjustment of the lens
(accommodation) necessary to bring objects of different objective distances
into focus on the retina. The focal length of the lens is altered by changing
the shape of the lens. The normal range of accommodation is about 20 feet
(6 m) to about 4 in (10.2 cm) (Brown, 1965).

i) Viewing a scene monocularly through an "artificial pupil" eliminates
the need for accommodation of the lens to focus on objects of different
distances. Objects at all distances will be in focus on the retina (Rock,
1975).


ii)  Collimation  is  the  optical  technique  of  passing  light  rays  through  a 
lens  in  such  a  way  that  they  are  parallel  when  reaching  the  lens  of  the  eye. 
This  eliminates  the  need  for  accommodation. 

Constraints:

*Accommodation is affected not only by the distance of the observed object but
by low ambient illumination or reduced and degraded visual stimulation which
occurs when pattern, contour, and contrast detail are absent from the visual
field (Iida, 1983; Schiffman, 1982). Under these conditions, when there are
few identifiable contours to act as a cue for the appropriate focus,
accommodation of the lens deviates from an accurate focusing mechanism (and
hence, distance cue) towards a resting state (see also Simonelli, 1983).

*Accommodation response is affected by an observer's nearsightedness (myopia),
farsightedness (hyperopia), and age (Simonelli, 1983). For an observer with
myopia, accommodation of the eyes will occur or "rest" at a nearer distance
than for an observer without myopia, even after correction with artificial
lenses (glasses or contacts). As an observer's age increases, the resting
point of accommodation will again move inward towards the observer. The
location of an observer's resting point is important because accommodation is
affected by both this distance and the distance of the object being observed
(Simonelli, 1983).

*Accommodation is affected by certain emotional states (Schiffman, 1982).

*When  both  accommodation  and  convergence  cues  (2.6.2)  respond  to  the  same 
stimulus  they  act  fairly  well  as  depth  cues  up  to  approximately  2  m  but  are  of 
limited  usefulness  beyond  this  point  (Wallach  &  Floor,  1971). 

2.6.2 Convergence. This cue is based on the proprioceptive feedback
from the muscles that rotate the eyes inward (cross-eyed) or outward to fuse
the  image  of  an  object  on  proper,  corresponding  locations  of  the  two  retinas. 
The  degree  of  convergence  is  defined  by  the  angle  between  the  axes  of  the 
separate  eyes  and,  therefore,  is  inversely  proportional  to  distance.  When 
viewing  a  distant  object,  the  axes  are  nearly  parallel  and  convergence  is  low. 
When  viewing  a  close  object,  convergence  of  the  eyes  increases. 
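
The relation between convergence angle and distance can be sketched with simple geometry. The example assumes symmetric fixation on an object straight ahead and an interocular separation of 0.063 m; that value and the function name are illustrative assumptions, not figures from this report.

```python
import math


def convergence_angle_deg(distance_m, interocular_m=0.063):
    """Angle (degrees) between the two visual axes when both eyes fixate
    a point straight ahead at distance_m. The interocular separation is
    an assumed typical value."""
    return math.degrees(2.0 * math.atan(interocular_m / (2.0 * distance_m)))


if __name__ == "__main__":
    for d in (0.25, 0.5, 1.0, 2.0, 4.0, 6.0):
        print(f"{d:5.2f} m -> {convergence_angle_deg(d):6.2f} deg")
```

Under these assumptions the angle changes by several degrees between 0.25 m and 0.5 m but by less than one degree between 2 m and 4 m, which is in keeping with the limited usefulness of convergence beyond roughly 2 m.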

Constraints:

*When both accommodation and convergence cues respond to the same stimulus
they act fairly well as depth cues up to approximately 2 m but are of limited
usefulness beyond this point (Wallach & Floor, 1971).

*Convergence  provides  less  accurate  information  with  monocular  viewing,  but  is 
still  available  to  some  extent  when  the  head  angle  is  fixed. 

2.7 Binocular Effects

These cues result from the use of two separated eyes. Because of this
separation, the two retinas receive slightly different images of the visual
scene. Convergence of the two eyes is necessarily a binocular cue (see 2.6
Muscular Sensations above).

2.7.1 Convergence. The degree of convergence, a muscular cue, is also a
binocular cue, and is described in Section 2.6.2.

2.7.2 Disparity. This cue concerns the reconciliation of noncoincident
(disparate)  images  on  the  two  retinas  into  a  singular  image  of  an  object 
perceived  at  some  depth  from  the  observer. 

Assume that two objects of equal size are located in the environment and
are within the visual field (see Figure 2.4). The relative locations of these
objects on the left and right retinas will differ due to the separation of the
two eyes. How they will differ depends on the absolute and relative depth
distances of the objects, and their horizontal separation. The perspectives
of this scene for the different eyes are shown in Figure 2.4. Note that the
horizontal distance between the two objects differs, depending on the eye's
viewpoint. If this disparity across the images of the separate eyes is not
too great, the visual system will "assume" the two images to both fully
represent the visual scene. The two images are then fused to form the
perception of one visual scene, comprising two objects, with a sense of depth.
If, however, the disparity of where the images fall on the two retinas is too
great, double images may be seen. This may occur in two ways, through crossed
or uncrossed disparity. Fixation of a near object will cause the projection
of a distant object's image to fall on the nasal sides of both retinas.
Therefore the right eye will see the far object as displaced to the right of
the near object, the left eye to the left. This condition is called uncrossed
disparity. Likewise, fixation of a distant object will cause the right eye to
see the near object as displaced to the left, the left eye to the right of the
fixated object (the unfocused image falling on the temporal side of both
retinas). This condition is called crossed disparity (see Figure 2.5).

Figure 2.4. Demonstrates binocular disparity. Note the larger differences
between the left and right view when the distance between objects is great
(top), relative to when it is not (bottom) (from Rock, 1975).

To create an artificial disparity, five methods have generally been
used: mirrors, lenses, rapid alternation of the left and right views (time
multiplexing), colored light (an anaglyph), and polarized light (a vectograph)
(Rock, 1975). The use of colored or polarized light requires the use of
spectacles  with  different  colored  or  polarized  lenses  for  each  eye.  The 
apparatus  shown  in  Figure  2.6  will  generate  the  view  described  above.  Note 
the  difference  in  relative  location  of  images  a  and  b  across  the  two  retinas. 
Each  of  these  techniques  presents  slightly  offset  images  to  the  two  separate 
eyes,  and  the  degree  of  offset  is  inversely  proportional  to  the  intended 
distance. This degree of offset is typically expressed in minutes and seconds
of visual angle. At one meter distance, differences in depth between objects
of a few seconds of visual angle can be resolved (see Section 4.2 for a
discussion of stereoscopic display technology).

Figure 2.5. Double images and disparity. Holding a relatively near and far
object as indicated and fixating on the near object will produce a single
image of the near object and double images of the far object. The image of
the near object falls on the foveae (F) of both eyes whereas the images of the
nonfixated, far object fall on the nasal part of each retina. The solid lines
represent the light reflected from the two objects; the dashed lines indicate
the apparent paths of the projected images from the far, nonfixated object
(from Kimble, Garmezy, & Zigler, 1980, p. 80).
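
Expressing an offset in minutes or seconds of visual angle is a matter of simple trigonometry. The sketch below is illustrative only: the function names, the 0.063 m interocular separation, and the sample distances are assumptions rather than values taken from the report, and the geometry assumes objects straight ahead of the observer.

```python
import math


def offset_to_arcmin(offset_m, viewing_distance_m):
    """Visual angle (minutes of arc) subtended by a horizontal offset
    between left-eye and right-eye images on a display."""
    angle = 2.0 * math.atan(offset_m / (2.0 * viewing_distance_m))
    return math.degrees(angle) * 60.0


def arcmin_to_offset(arcmin, viewing_distance_m):
    """Inverse of offset_to_arcmin: on-screen offset (m) for a given angle."""
    return 2.0 * viewing_distance_m * math.tan(math.radians(arcmin / 60.0) / 2.0)


def depth_disparity_arcsec(near_m, far_m, interocular_m=0.063):
    """Disparity (seconds of arc) between two objects straight ahead,
    taken as the difference of their convergence angles. The interocular
    separation is an assumed typical value."""
    def vergence(distance_m):
        return 2.0 * math.atan(interocular_m / (2.0 * distance_m))
    return math.degrees(vergence(near_m) - vergence(far_m)) * 3600.0


if __name__ == "__main__":
    print(offset_to_arcmin(0.005, 0.6))          # 5 mm offset viewed from 0.6 m
    print(depth_disparity_arcsec(1.000, 1.001))  # 1 mm depth step at 1 m
```

With the assumed eye separation, a depth difference of 1 mm at a distance of one meter corresponds to a disparity on the order of ten seconds of arc, which gives a sense of the scale involved when depth differences of a few seconds of visual angle are resolved.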

Constraints:

*Large amounts of disparity will result in the perception of double images.
If a continuum of disparity values is present, these large disparities will be
better tolerated (Burt & Julesz, 1980). Yeh and Silverstein (1989) measured
disparity thresholds while controlling for possible assistance from eye
convergence, using a "time-multiplexing" system. Threshold measurement is
defined as the horizontal offset between the left and right eye images,
converted into visual angle. For crossed disparity, the threshold was 27.11
minutes of arc, but only 24.21 for uncrossed disparity.

*The effects of "cross-talk" between the two images sent to the different eyes
(Yeh & Silverstein, 1989) can sometimes produce "ghosts." This will occur
with multiplexing imaging techniques when the display phosphor does not decay
rapidly enough; a "ghost" of the right eye image then remains on the
display when the left eye views, and vice versa. This ghosting appears
sensitive to the color of the display and is minimized with red or amber
images.

*Only horizontal disparity in the visual scene will produce depth perception.
Vertical disparity will result in the suppression of one eye's image or in the
perception of a double image. This constraint results from the horizontal
placement of human eyes (Rock, 1975). Hence, artificial stereo displays must
be viewed with the orientation of the head orthogonal to the direction of
disparity. Therefore, a stereo-pair display is also not appropriate for
multiple observers or off-angle viewing (Williams & Garcia, 1989a,b).

*The "virtual 3D environment" of stereo-pair displays (e.g., from Tektronix,
StereoGraphics) cannot be fused by 10% of the population. However, this
problem becomes less acute if the images are dynamic (Tittle, Rouse, &
Braunstein, 1988).

*Using artificial stereoscopic displays, different kinds of surface
textures/patterns will interact with an observer's ability to accurately
perceive stereoptically presented information. Random textures allow better
depth perception than regular patterns; horizontal-vertical textures tend to
be seen as flatter and farther away than diagonal ones; and continuous
textures tend to be seen closer than discontinuous ones (Ninio & Mizraji,
1985).

*Stereopsis is most effective in near space, for distances reachable by hand.
Objects that are far away produce more or less similar disparities:
"stereoscopically, they are at the same distance" (Prazdny, 1986). However,
disparity may be used to artificially simulate depth differences of distant
objects (Zenyuh, Reising, Walchli, & Biers, 1988).

2.8 Nuisance Cues

The previous discussion has described a number of cues to distance.
From the perspective of the display designer, these cues are important in two
different capacities: as sources of relevant information to be incorporated
into display design, and as unwanted "nuisance" variables, whose effects may
need to be filtered out or compensated. In the first case we consider, for
example, the importance of including depth cues such as stereopsis or texture
to create a compelling sense of depth. In the second case, we consider a
situation in which a cue inappropriately signals a different sense of distance
from that intended. As one example, a rectangular wire frame like that shown in
Figure 2.6 might be used to provide a flight path "tunnel" in the sky (see
Section 4). However, the cue of height in the visual field will signal that
the near end is actually farther away when (as is the case here) the tunnel
is viewed from beneath. As a second example, collimation may be employed to
create  a  sense  of  depth  from  the  cue  of  accommodation.  But  if  there  are 
prominent  marks  on  the  display  surface,  these  marks  may  "signal"  to  the  eye 
muscles  a  point  on  the  display  surface  upon  which  the  two  eyeballs  will 
converge.  Hence,  the  "nuisance"  convergence  cue  will  provide  information  that 
the  display  is  near,  partially  neutralizing  the  effect  of  accommodation. 

This  second  example  relates  to  a  set  of  characteristics  that  Braunstein 
(1976)  refers  to  as  cues  to  flatness .  While  the  previous  section  has 
described  properties  of  the  displayed  information  that  can  create  a  more 
compelling  perception  of  depth,  Braunstein  emphasizes  properties  of  the 
surrounding  display  frame  and  surface  that  can  detract  from  this  perception  by 
signaling  to  the  observer  that  the  display  actually  is  a  flat  two-dimensional 
surface.  These  cues  to  flatness  can  include  the  physical  frame  surrounding 
the  display,  and  any  identifiable  marks  on  the  display  surface.  Techniques  to 
reduce cues to flatness include monocular viewing, viewing through a dark tube
at a greater distance, viewing through a collimated lens, and viewing when the
background is not illuminated. Such techniques greatly enhance the perception
of three dimensionality (Larish & Flach, in press; Palmer & Petitt, 1977).

3.0 MULTIPLE CUE INTERACTION


In this section we review individual studies that have examined the
relative strengths of various depth cues. In these studies, two or more cues
are used to portray a 3D image. These separate cues, however, could present
information concerning the object's location or orientation in depth that is
either conflicting or congruent. Studies of the first class, which may be
described as cue tradeoff studies, assess which cue "wins" when two are
placed  in  conflict.  Studies  of  the  second  class,  which  may  be  described  as 
cue  compellingness  studies,  assess  the  compellingness  of  a  sense  of  depth, 
created  as  different  cues  are  added. 

One  focus  of  this  review  is  to  establish  the  best  "model"  for  cue 
combination.  An  additive  model  is  one  in  which  the  sense  of  depth  is  an 
additive  function  of  the  number  of  cues  employed  (Bruno  &  Cutting,  1988).  The 
cues may have equal weight, or some may be dominant over others. A
non-additive model is one in which the contribution of a given cue to the sense
of depth will differ, depending on the number or kind of cues already present.
The effect may either be subtractive (a cue's influence is diminished) or
super-additive (its influence is enhanced).
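
The distinction between the additive and non-additive models can be made concrete with a small sketch. Everything in it is hypothetical: the cue names, the weights, and the pairwise product term used to represent interaction are illustrative choices, not parameters estimated in any of the studies reviewed here.

```python
def additive_depth(cue_strengths, weights):
    """Additive model: the sense of depth is a weighted sum of the cues
    present. Equal weights give every cue the same influence; unequal
    weights let some cues dominate others."""
    return sum(weights[name] * strength for name, strength in cue_strengths.items())


def interactive_depth(cue_strengths, weights, interactions):
    """Non-additive model: the additive sum plus pairwise interaction
    terms. A negative coefficient is subtractive (a cue's influence is
    diminished when the other cue is present); a positive coefficient is
    super-additive (its influence is enhanced)."""
    total = additive_depth(cue_strengths, weights)
    for (cue_a, cue_b), coefficient in interactions.items():
        total += coefficient * cue_strengths.get(cue_a, 0.0) * cue_strengths.get(cue_b, 0.0)
    return total


if __name__ == "__main__":
    weights = {"stereopsis": 0.5, "motion": 0.4}   # hypothetical weights
    both = {"stereopsis": 1.0, "motion": 1.0}
    print(additive_depth(both, weights))
    print(interactive_depth(both, weights, {("stereopsis", "motion"): -0.3}))
```

In this sketch the two models agree whenever only one cue is present and diverge only when cues co-occur, which is the situation the tradeoff and compellingness studies are designed to probe.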

The sections are organized as follows: Section 3.1 reviews the work
done with stereopsis in combination with other depth cues which are either
static (3.1.1) or dynamic (3.1.2). Section 3.2 reviews studies on the effects
of motion; Section 3.3 reviews studies on the effects of shadows, attached and
cast; Section 3.4 addresses the cue of height in the visual field; Section 3.5
covers studies on the effects of a surface's texture pattern; Section 3.6
addresses the effects of object luminance (the Proximity Luminance Covariance,
PLC); and Section 3.7 presents a summary, and the conclusions drawn from the
multiple cue studies that have implications for the display designer.

3.1 Stereopsis

3.1.1 Stereopsis with static cues. Van der Meer (1979) investigated
depth perception when the depth cues of binocular disparity and perspective
(actually a converging texture gradient with decreasing y-axis heights to
represent distance) presented either congruent or conflicting depth
information. Stereopsis was created with the use of an aniseikonic lens
(a lens which magnifies an image in the horizontal meridian) before one eye (see
Gillam, 1968). Perspective was represented by the decreasing height of
vertical bars in the pattern viewed: the bars were 15 cm (no perspective
effect), 12, 8.3, 5, or 1.7 cm (the strongest perspective effect). The
observer's head was kept stationary. As expected, when the two cues indicated
an increasing depth, depth perception increased. When the two cues presented
conflicting information, 12 of the 18 observers used stereopsis, four used
perspective, and two used the stereoptic cue in all but the two strongest
perspective conditions. The effects of stereopsis and perspective in this
experiment were additive: with increasing amounts of perspective, whether
congruent or in contrast to the orientation of the stereopsis used, the amount
of depth perceived increased/decreased linearly.

In a lengthy review of research on slant perception, Braunstein (1976)
concludes that contour convergence--a sort of linear perspective--and
stereoscopic viewing are roughly additive in their effects on the perception
of slant.

Kim, Ellis, Tyler, Hannaford, and Stark (1987) examined depth perception
when the depth cues of binocular disparity and perspective presented only
congruent information. Observers were asked to track a target moving in three
dimensions. In the first condition, the target moved above a flat, floor-like
grid which facilitated the perception of the perspective depth cue; the target
was connected to the grid by a vertical reference line. In other conditions,
either the grid, the reference line, or both were absent from the display.
All these conditions were shown either with or without stereopsis. Overall,
3D tracking was more accurate with use of the stereoscopic display, with or
without the reference line or grid. However, with the reference line and grid
present (and placement of the observer's viewpoint at an elevation of 45° and
an azimuth of 0° or 45° to the grid), tracking was as accurate for monoscopic
viewing with perspective as under stereopsis.

Braunstein, Andersen, Rouse, and Tittle (1986) examined depth perception
when the depth cues of binocular disparity and occlusion presented either
congruent or conflicting information. The computer display of a rotating
sphere with randomly placed, irregular pentagons on its surface was
stereoptically presented to observers using a "Brewster stereoscope" (a prism
stereoscope; Braunstein, 1986). The sphere itself was either opaque--leading
to "edge occlusion" as the texture elements (the pentagons) disappeared and
appeared over the horizons of the sphere--or transparent--leading to "element
occlusion" as the texture elements on the closer surface of the sphere moved
between the observer's viewpoint and the texture elements on the far side of
the sphere. The direction of the sphere rotation was indicated by both
stereopsis and either one or the other type of occlusion, with the two cues
presenting either congruent or conflicting information. When both stereoptic
and occlusion cues represented the same direction of rotation in depth, 94% of
the judgments matched this rotation with the use of element occlusion and 91%
of the judgments matched with the use of edge occlusion. When the stereoptic
and occlusion cues represented opposite directions of rotation, the percent of
judgments following the stereoptic information fell to 81% with the use of
element occlusion and to 36% with the use of edge occlusion. Hence, edge
occlusion is a more dominant cue than element occlusion. [Note: Andersen and
Braunstein (1983) found that the occlusion of shapes (e.g., squares,
pentagons) is more effective than the occlusion of small texture elements
(e.g., dots).]

In a similar study, Tittle, Rouse, and Braunstein (1988) presented
subjects with the image of a rotating cylinder viewed stereoscopically. The
cylinder was covered with texture elements and was opaque, so that elements on
the rear face of the cylinder would be obscured by the front face
(interposition). Subjects judged the direction of rotation of the cylinder
when stereopsis cues and interposition cues regarding the front and back face
were in concordance and in conflict. In the conflict condition, 91% of the
judgments were dominated by interposition.

Cavanagh (1987) placed stereopsis and interposition in conflict as
observers  viewed  two  overlapping  wire  frame  squares  (the  front  and  back  frames 
of  a  "Necker  cube").  Using  subjective  reports,  he  found  that  observers  were 
roughly  divided  regarding  which  cue  dominated  perception. 

Dosher et al. (1986) examined depth perception when the depth cues of
binocular disparity and proximity luminance covariance (PLC) presented either
congruent or conflicting information. The image of a luminous, wire cube was
presented on a CRT. The brightness of the individual lines that formed the
cube was varied as a function of each line's objective distance from the
observer's viewpoint (the displayed intensity of a line decreased in
inverse proportion to the distance of that line). All lines that formed the
"wire cube" could be seen. The image of the cube was shown stereoptically
using a system of mirrors and baffles; a constant degree of perspective
(nonpolar) was used throughout the experiment. Observers saw either a
stationary cube, a stationary cube which then began rotating, or just a
rotating cube. Disparity determined depth perception for all four observers
for the stationary and stationary-then-rotating cube images. For depth
perception of the rotating-only cube, stereo disparity was a strong cue for
only two of the observers. The effects of PLC varied widely across the four
observers. Overall, stereopsis was usually a stronger cue than PLC, but
the relative importance of a cue appeared largely situation dependent.

Bulthoff and Mallot (1988) examined depth perception when the depth cues
of binocular disparity, attached shadow (i.e., shading), and highlights
presented only congruent information. The surface quality (smooth versus
faceted) was also manipulated. The static 3D image of an ellipsoid was
presented on a CRT. The ellipsoid's surface was either smooth, resulting in a
gradual attached shadow (shading) across its surface, or faceted, resulting in
visible edges (sharp changes in shading) within the attached shadow. The
light source was either diffuse or specular (creating a highlight on the
object's surface) and was positioned at the observer's viewpoint. The static
ellipsoid was also presented with or without binocular disparity (using
interlaced images viewed through a shutter stereoptic system). The amount of
perceived depth of the ellipsoid ranged from high (almost matching the 3D
shape of the object) to low (the inaccurate perception of a very shallow
ellipsoid) in order with the following combinations of cues: binocular
disparity and shading with edges; binocular disparity and shading without
edges (that is, a smooth shadow); no disparity and shading with edges; no
disparity and smooth shading. (The authors noted that the use of edges--with
their sharp changes in shading--resulted in a significant increase in
perceived depth as compared to the use of gradual shading.) (The use of
shading without edges reduced perceived depth by about 25%.) The use of
diffuse or specular lighting did not make a difference in perceived depth.

In a second aspect of the same experiment, Bulthoff and Mallot (1988)
examined depth perception when the depth cues of binocular disparity and the
location of the light source were varied. The surface quality (smooth versus
faceted) was also manipulated. The static 3D image of a smooth ellipsoid was
presented on a CRT. A diffuse light source illuminated the ellipsoid from one
of two locations: from the upper left or lower right, both in front of the
object (at ±14° azimuth and ±13.6° elevation from the line of view). The
ellipsoid was also presented with or without binocular disparity (using
interlaced images viewed through a shutter stereoptic system). The
nonilluminated side of the ellipsoid (the dark, shadowed region) was perceived
as flat. When the light source was positioned to the lower right of the
object, the ellipsoid was occasionally perceived as a concave surface (this is
due to the implicit assumption by the observer that the light source is
located above the visual scene, casting shadows downwards, as is usually the
case). The use of binocular disparity prevented this reversal.

3.1.2 Stereopsis with motion. Braunstein et al. (1986) examined depth
perception when the depth cues of binocular disparity and a velocity gradient
presented either congruent or conflicting information. A surface consisting
of two slanting planes meeting at a horizontal midline was presented
stereoptically to observers using a Brewster stereoscope. The apex of this
surface could either be pointing towards or pointing away from the observer.
Randomly placed dots moved across this surface horizontally. The speed of an
individual dot depended on its distance from the apex and its proximity to the
observer. The varying speeds of all the dots comprised the velocity gradient.
The direction of the apex was indicated by both stereopsis and the velocity
gradient, with the two cues presenting either congruent or conflicting
information. Subjects judged whether the apex pointed toward (near) or away
(far) from them. When the two cues were congruent, judgments were highly
accurate. When stereoptic and velocity gradient cues were placed in conflict,
the number of judgments following the orientation represented by the
stereoptic cue was significantly reduced. When the velocity gradient
represented a near apex, the proportion of judgments following the stereoptic
cue dropped from approximately 93% to 50%; when the velocity gradient
represented a far apex, the drop was from approximately 89% to 52%. Higher
velocity of movement created a greater influence of velocity gradient on
perceived depth.

Tittle  and  Braunstein  (1989)  examined  subjects'  ability  to  recover  the 
perceived  structure  of  objects  (a  rotating  cylinder)  from  their  motion  with 
stereo viewing, and concluded that motion had a superadditive effect. That
is,  the  effects  of  stereoscopic  viewing  were  enhanced  under  motion. 

Prazdny (1986) examined depth perception when the depth cues of
binocular disparity and motion presented only congruent information. A CRT
display was filled with a pattern of black and white dots, with equal amounts
of each randomly placed. In one condition, a 3D object was portrayed as a
moving area of alternating black and white dots. In a second condition, a 3D
object was portrayed within two patterns of black/white dots which were viewed
stereoptically. The perceived shape of the 3D object was determined by the
motion of the object (i.e., the "kinetic depth effect") while its location in
depth was determined by disparity.

A summary of these studies will be presented and discussed in Section 3.7.

3.2 Motion

Todd (1985) examined depth perception when the depth cues of motion (of
a rotating object) and attached shadow presented only congruent information.
A computer-generated ellipsoid was rotated about its vertical axis. The
surface of the ellipsoid was either shaded (with an attached shadow as if the
light source were located at the observer's viewpoint) or not (resulting in a
simple outer contour). The object was presented using parallel projection.
Without  an  attached  shadow  (shading)  the  object  appeared  as  an  elastic  disc 
horizontally  stretching  and  contracting  across  the  screen  of  the  CRT.  With 
the  attached  shadow  all  observers  perceived  a  solid,  rotating  object  in  3D 
space.  A  static  image  of  the  shaded  ellipsoid  was  still  perceived  as  a  3D 
object,  though  its  perceived  3D  shape  differed  from  that  of  a  dynamic 
presentation.  For  example,  a  static  image  would  appear  as  a  sphere,  from 
viewing  the  ellipsoid  on  end. 

In  another  study,  Todd  (1984)  examined  depth  perception  when  the  depth 
cues  of  motion  (of  a  rotating  object)  and  perspective  (the  use  of  polar  versus 
parallel  projection)  presented  only  congruent  information.  Five  rotating 
surfaces  of  differing  degrees  of  curvature  (bending  towards  the  observer)  were 
presented  on  a  CRT.  These  surfaces  were  depicted  in  either  a  polar  (i.e., 
with  perspective)  or  parallel  fashion.  The  surfaces  were  also  either  rigid 
(e.g., a rolling cylinder) or nonrigid (e.g., a cylinder that stretched
horizontally as it rolled). They were viewed binocularly. A surface
presented with perspective projection was perceived to be more curved than if
presented with a parallel projection. However, neither projection (polar or
parallel) resulted in increased observer accuracy in judging the amount of
curvature present. The rigidity or nonrigidity of a surface had a negligible
effect.
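
The difference between the two projection types compared in these studies can
be made concrete with a short sketch. The Python fragment below is
illustrative only: the function names, the coordinate convention (z increasing
away from the observer), and the distance d of the center of projection from
the image plane are assumptions introduced here, not details reported by
Todd (1984).

```python
def parallel_project(x, y, z):
    """Parallel (orthographic) projection: the depth coordinate is discarded."""
    return (x, y)


def polar_project(x, y, z, d):
    """Polar (perspective) projection onto an image plane at z = 0.

    The center of projection is assumed to lie a distance d in front of the
    image plane, and z increases away from the observer, so points with
    larger z are scaled toward the center of the image.
    """
    if d + z <= 0:
        raise ValueError("point lies at or behind the center of projection")
    scale = d / (d + z)
    return (x * scale, y * scale)


if __name__ == "__main__":
    # Two points with the same lateral offset but different depths.
    near = (1.0, 0.0, 0.0)
    far = (1.0, 0.0, 2.0)
    print(parallel_project(*near), parallel_project(*far))
    print(polar_project(*near, d=2.0), polar_project(*far, d=2.0))
```

In this sketch the two points coincide on the image under parallel projection,
whereas under polar projection the farther point is drawn closer to the center
of the image. That foreshortening is the perspective information that parallel
projection omits.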

Bruno  and  Cutting  (1988)  examined  depth  perception  when  the  depth  cues 
of  motion  (motion  parallax),  relative  size,  height  in  the  visual  scene,  and 
occlusion presented only congruent information. Three simple squares, with no
surface textures, were presented upon a blank background on a vector-plotting
display.  They  represented  three  parallel  panels  at  equal  distances  from  one 
another  in  the  depth  plane.  The  depiction  of  these  three  panels  varied  with 
the  application  of  the  following  depth  cues:  relative  size,  height  in  the 
visual  scene,  occlusion,  and  motion  parallax.  The  display  was  viewed 
binocularly.  Observers  judged  the  apparent  relative  distance  between  the 
three panels. The data were analyzed not to find the relative importance of
one depth cue over another, but to test the additivity of depth cues in the
perception of depth. The finding that all cues were relatively strong and
noninteracting provided strong evidence for the additivity of depth cues.
However, while the presence of more cues did lead to the perception of an
increased separation between the panels, it did not increase the certainty of
this judgment, which is unexpected. Variability among observers in the
effectiveness of occlusion (and, to a lesser extent, of motion parallax) was
noted.
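
The additive model tested by Bruno and Cutting (1988) can be sketched as
follows. The cue weights and baseline below are hypothetical values in
arbitrary rating units, chosen only for illustration; they are not estimates
from the study. The sketch predicts a depth rating as a baseline plus one
fixed contribution per available cue, and computes a difference of differences
that is zero when two cues do not interact.

```python
# Hypothetical cue weights (arbitrary rating units), NOT values from the study.
WEIGHTS = {
    "relative_size": 1.0,
    "height_in_field": 1.5,
    "occlusion": 2.0,
    "motion_parallax": 2.5,
}
BASELINE = 1.0  # assumed rating when no cue is present


def additive_rating(present):
    """Predicted rating: baseline plus one fixed term per cue present."""
    return BASELINE + sum(WEIGHTS[cue] for cue in present)


def interaction(rating, cue_a, cue_b):
    """Difference of differences for two cues in a 2 x 2 design.

    A value of zero means the effect of cue_a is the same whether or not
    cue_b is present, which is what an additive model predicts.
    """
    effect_with_b = rating({cue_a, cue_b}) - rating({cue_b})
    effect_without_b = rating({cue_a}) - rating(set())
    return effect_with_b - effect_without_b


if __name__ == "__main__":
    print(additive_rating({"occlusion"}))
    print(additive_rating({"occlusion", "motion_parallax"}))
    print(interaction(additive_rating, "occlusion", "motion_parallax"))
```

A nonzero value returned by the interaction function for observed ratings
would indicate a departure from additivity for that pair of cues.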

Braunstein  (1986)  used  computer  animation  to  produce  motion  picture 
sequences  from  which  velocity  gradients  could  produce  one  perception  of  slant, 
and  texture  gradients  another.  He  concluded  that  velocity  dominated  texture, 
such that the final perception of slant was influenced twice as much by
velocity  as  by  texture. 

Cavanagh (1987) used a paradigm identical to that described in Section
3.1.1 involving subjective interpretation of the faces of a Necker cube (Fig.
2.2 left). Relative motion of the faces was placed in conflict with
interposition. Observers were divided in their subjective impression of
depth, some governed by interposition and some by relative motion.

3.3 Shadows

Rock et al. (1982) examined depth perception when the depth cues of cast
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discern  the  apparent  location  of  the  specular  light  source  from  the  highlight 
is  important.  This  ability  depends  on  the  intensity  of  the  light  source  and 
the  reflectance  qualities  of  the  object's  surface. 

3.4  Height  in  the  Visual  Field 

Berbaum et al. (1983) also examined depth perception when the depth cues
of height in the visual field and occlusion presented only congruent
information. Several very simple line drawings were observed monocularly.
These drawings varied in the compellingness of a 3D interpretation of the
object. The amount of depth perceived in a drawing was measured by the
simultaneous use of a stereoscope and a pointer that moved in depth. The most
general result was that the lower part of the visual field appeared closer and
the higher part of the visual field appeared farther away from the observer.
This was also the case when an observer viewed a totally blank field. The
presence of occlusion also increased the perceived depth.

3.5

Todd  and  Akerstrom  (1987)  examined  depth  perception  when  the  depth  cues 
of  surface  texture  pattern  and  perspective  (the  use  of  polar  versus  parallel 
projection)  presented  only  congruent  information.  Five  opaque  ellipsoids  were 
presented  on  a  CRT;  all  were  of  a  constant  width  but  varied  in  length  (and 
therefore  depth)  along  the  z-axis.  Their  surfaces  were  depicted  in  either  a 
polar (i.e., with perspective) or parallel fashion. On the surfaces, the
texture elements were randomly located; the texture elements were either
squares or varied in size and shape. In further experiments, the regular
texture elements were manipulated by controlling their rate of compression as
distance increased, their visible area, their width, and their orientation
(once compressed) relative to the surface of the display (all of which
normally covary with compression). The ellipsoids were viewed monocularly.

An  ellipsoid  presented  with  perspective  (polar  projection)  was  perceived 
to  be  slightly  more  curved  than  if  presented  with  a  parallel  projection.  The 
amount  of  depth  perceived  did  not  vary  across  texture  type  (regular  versus 
irregular). From the second experiment, whose stimuli are depicted in Figure
3.1, the effects of element compression were found to be unimportant, but
proper element orientation (i.e., the elements properly aligned on the
object's surface) and element width were found to control the perception of
depth.

Cutting  and  Millard  (1984)  examined  depth  perception  when  the  depth  cue 
of  surface  texture  followed  a  perspective  gradient  (the  changing  x-axis  width 
of  the  elementary  texture  units),  a  compression  gradient  (the  changing  y/x 
axes  ratio),  and/or  a  density  gradient  (the  changing  number  of  texture  units 
per  visual  angle).  All  three  gradients  presented  only  congruent  information. 
Two  static  surfaces  were  portrayed  on  a  vector-plotting  display  with  a 
vertical  dividing  line  between  them.  Both  surfaces  represented  either  a  flat 
or curved plane whose texture elements were either regular or irregular
octagons.  The  texture  elements  were  randomly  placed  to  avoid  the  effects  of 
linear  perspective.  Observers  were  asked  to  choose  which  half  of  the  display 
appeared  more  like  a  flat/curved  surface  and  to  assess  the  strength  of  this 
perception  relative  to  the  unpreferred  side.  Perception  of  the  flat  surface 
was controlled 65% by manipulation of the perspective gradient, 28% by the
density gradient, and 6% by the compression gradient. In contrast, perception
of the curved surface was controlled 96% by manipulation of the compression
gradient, and about 2% each by the perspective and density gradients. The
use of regular or irregular texture elements had no effect.

Figure 3.1. These images represent two ellipsoids pointing out of the page.
In the left image, all the texture elements are squares and are increasingly
smaller (compressed) along the edge of the image to represent the more distant
surface of the ellipsoid as it curves away from the observer. In the right
image, the rate of compression is the same as in the left image but the
texture units decrease in width and are properly oriented on the surface.

Palmer and Petitt (1977) examined the extent to which collimation of the
visual image to optical infinity could augment depth perception induced by
textural gradient and linear perspective. Their measure of compellingness was
the  extent  to  which  subjects  are  unable  to  avoid  the  perception  of  depth,  and 
thereby  report  the  display  size  of  objects  as  being  what  their  true  size  would 
be,  if  viewed  at  a  distance,  rather  than  what  their  retinal  image  is.  If 
subjects  are  unable  to  suppress  this  depth  information,  their  report  of 
objective  display  size  will  grow  as  perceived  distance  is  increased.  The 
investigators had subjects estimate the display size of images superimposed on
a static runway which provided receding cues of linear perspective and textural
gradients. Some depth-induced distortion was found with the display viewed at
a near image plane. However, when the display was viewed through a
collimating lens, projecting it to optical infinity, the distortion was
doubled in magnitude. The authors assumed that collimation does not act
as a depth cue itself, but removes a "cue to flatness" (Braunstein, 1976).
This removal makes the sense of depth conveyed by other cues that much more
compelling.

3.6  Luminance  and  Brightness 

Schwartz  and  Sperling  (1983)  examined  depth  perception  when  the  depth 
cues  of  proximity  luminance  covariance  (PLC)  and  perspective  (the  use  of  polar 
versus  parallel  projection)  presented  either  congruent  or  conflicting 
information.  The  image  of  a  rotating,  luminous,  wire  cube  was  presented  on  a 
display.  The  brightness  of  the  individual  lines  that  formed  the  cube  was 
varied.  In  the  first  condition  the  objectively  near  side  of  the  cube  held  the 
brightest  lines  while  the  lines  of  the  far  side  of  the  cube  were  dim;  in  the 
second condition this situation was reversed; in the third condition all lines
of the cube were equally bright. The amount of perspective with which the
cube  was  depicted  was  varied  by  presenting  either  parallel  or  polar 
projection.  The  stimuli  employed  are  shown  in  Figure  3.2. 
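
The three luminance conditions can be sketched as a mapping from the depth of
a cube edge to its displayed luminance. The linear ramp, the bright and dim
values, and the function and mode names below are assumptions made for
illustration; the source does not state that Schwartz and Sperling (1983) used
this particular mapping.

```python
def edge_luminance(depth, near, far, mode, bright=1.0, dim=0.2):
    """Return the luminance for a cube edge at the given depth.

    depth, near, far: distances from the observer (near < far).
    mode: "near_bright" - luminance falls with distance (first condition)
          "far_bright"  - the covariance is reversed (second condition)
          "uniform"     - all edges equally bright (third condition)
    The linear ramp between bright and dim is an assumption of this sketch.
    """
    if mode == "uniform":
        return bright
    if not near < far:
        raise ValueError("near must be smaller than far")
    t = (depth - near) / (far - near)  # 0 at the nearest edge, 1 at the farthest
    if mode == "far_bright":
        t = 1.0 - t
    elif mode != "near_bright":
        raise ValueError("unknown mode: " + mode)
    return bright * (1.0 - t) + dim * t


if __name__ == "__main__":
    for mode in ("near_bright", "far_bright", "uniform"):
        ramp = [round(edge_luminance(z, 1.0, 2.0, mode), 2) for z in (1.0, 1.5, 2.0)]
        print(mode, ramp)
```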

Proximity  luminance  covariance  overwhelmingly  controlled  the  perception 
of  depth,  with  the  brighter  side  of  the  3D  wire  cube  perceived  as  being 
closest to the observer no matter what degree of perspective was used. In the
absence of perspective (parallel projection), PLC determined depth perception
in 97.4% of the trials; averaged over all degrees of perspective used, PLC
determined depth perception in 90.5% of trials. Perspective by itself
affected  the  depth  perceived  in  a  weak  and  inconsistent  manner.  A  suggestion 
was  made  that  the  PLC  effect  may  not  be  from  luminance  per  se,  but  from  the 
effects  of  contrast  between  the  cube's  lines  and  the  background  luminance. 

3.7 Summary Conclusions of Cue Combination Studies

The results presented above are summarized in Table 3.1. The table
presents various depth cues down the columns and across the rows. Hence, an
entry in the table designates a particular study in which the defining row and
column cue were either traded against each other in a tradeoff study, or were
covaried in a compellingness study. The former are indicated by a single
circle, the latter by a double circle. In tradeoff studies, an arrow points
in the direction of the dominant cue. In compellingness studies, an arrow (if
present) points in the direction of the cue which had greater weight. An "x"
indicates a relatively weak effect. An "A" indicates a generally additive
relation; a "+x" indicates a positive interaction, such that the presence of
one cue amplifies the effect of the other. A "-x" indicates a negative
interaction, such that the presence of one cue diminishes or even abolishes
the effect of another.

Figure 3.2. Various stimuli employed by Schwartz and Sperling (1983),
showing parallel projection (top) and polar projection (bottom) with PLC in B,
C, F, and G, and surface occlusion in D and H.

Reviewing the data presented in Table 3.1, and interpreting these in light
of the experimental descriptions, allows the following conclusions to be
drawn:

(1)  Stereopsis  is  a  compelling  depth  cue  and  tends  to  dominate  those 
cues  with  which  it  is  paired.  However,  there  is  no  evidence  that  stereopsis 
is  so  totally  dominant  that  its  presence  eliminates  the  influence  of  other 
cues. In fact, on three occasions stereopsis was dominated by other cues:
once  weakly  and  once  strongly  dominated  by  edge  occlusion,  and  once  dominated 
by  motion  in  interpreting  3D  shape. 

(2)  Motion  is  also  a  dominant  cue.  Its  effects  with  other  cues  are 
either  additive,  or,  if  there  is  an  interaction,  the  form  of  the  interaction 
is  either  positive  (motion  enhances  the  value  of  the  other  cue),  or  negative 
(motion  diminishes  the  effect  of  the  other  cue,  but  is  not  itself  diminished 
by the other cue's presence, i.e., in slant perception and in object
recognition).

(3)  Occlusion  or  interposition,  although  less  frequently  examined,  also 
shows  evidence  of  being  a  strong,  dominant  cue,  and  becomes  more  so  if  the 
occluding  and  occluded  surfaces  are  relatively  large. 

(4)  Texture  gradient,  PLC,  and  perspective  are  relatively  weaker  in  the 
dominance ordering, although they are not unimportant, and do contribute to
additive  relations  in  compellingness  studies. 

(5)  Long  term  experience,  as  it  influences,  for  example,  relative  size, 
appears  to  be  a  fairly  weak  cue.  It  may  be  easily  dominated  by  perspective 
and  occlusion,  while  the  assumption  of  rigidity  of  objects  (an  assumption 
based  on  long  term  experience)  is  also  easily  violated. 

(6)  Highlighting  is  relatively  weak,  and  appears  not  to  be  worth  the 
effort and computer power necessary to implement for dynamic displays.

(7) Generally, an additive model of cue combination with nonequal
weights appears to hold up well, with one important exception: the presence
of motion appears to violate the model in over half of the cases. This is
true in cases when the presence of motion is examined as a cue, or when other
cues are examined in a dynamic display (e.g., Kim et al., 1987). Furthermore,
Tittle et al. (1988) observed that stereo-deficient viewers of static displays
were not stereo deficient when the displays were dynamic. This conclusion
suggests the need to be wary about generalizing results from a static display
to a dynamic one.

(8)  Cue  dominance  appears  to  be  somewhat  task  dependent.  For  example, 
if  3D  shapes  of  objects  are  to  be  recognized  in  a  dynamic  environment, 
stereopsis is less important. If 3D locations of objects are to be
interpreted, or if the environment is static, stereopsis becomes more
critical.  Occlusion,  either  through  hidden  lines  or  hidden  surfaces,  should 
always  be  used  in  either  case. 

(9)  Absence  of  cues  to  flatness  can  enhance  the  compellingness  of  depth 
cues  in  an  interactive  fashion.  These  may  include  viewing  in  a  darkened 
environment, removing display contours, or projecting at optical infinity
through  collimation. 

The preceding conclusions are not offered with complete certainty, and
clearly more research is needed on multiple cues. In particular, the
conclusions offered above lead to the identification of a number of specific
research questions that remain unresolved. For example:

(1) What is the role of immediate past viewing experience on the use of
depth cues (Dosher et al., 1986)? Will this experience "lock on" to a
perceptual interpretation of near and far that cannot be revised by subsequent
viewing, forming a sort of perceptual inertia?

(2) How dynamic must an image be to fully achieve the important
influences of motion?

(3) What are the effects of 3D cues on precise check reading of linear
distances orthogonal to the line of sight? How are these influenced by field
of view, azimuth, and elevation angles (McGreevy & Ellis, 1986)? Are
illusions of depth also additive with the number of cues?

4.0 3D DISPLAY IMPLEMENTATION

When 3D displays are designed for use in operational settings, a number
of specific principles and considerations should be kept in mind, in addition
to the specific issue of what cues to incorporate. In this section we review
some of these considerations, which impact the actual design of the display.
Section 4.1 considers principles for the design of perspective displays, and
Section 4.2 addresses those considerations that are specific to displays using
observer-centered cues. This second section, of necessity, focuses heavily on
the existing technology for creating stereo or virtual images.

4.1 Perspective Display Implementation

4.1.1 Geometry of perspective viewing. When perspective geometry is
employed to represent specific locations of objects and landmarks in space, a
series of considerations must be addressed (McGreevy & Ellis, 1986; McGreevy,
Ratzlaff, & Ellis, 1986; Ellis & Grunwald, 1989; Ellis, 1989). Many of these
are best understood in the context of a picture of the geometric relations
involved in such a display, as illustrated in Figure 4.1. The figure depicts
an eye viewing a display frame of an outside scene. The geometry of the scene
is depicted from a top-down orientation. Two objects, an "x" and an "o," are
depicted in the outside scene, and these are also presented on the display
image plane. Five terms are critical to understanding the geometry of the
perspective viewing in Figure 4.1.

(1)  The  visual  angle  (VA)  of  the  screen  is  the  angle  subtended  by  the 
display  screen  as  it  is  observed  by  the  viewer. 

(2)  The  viewing  distance  (VD)  is  the  distance  of  the  eyes  from  the 
display screen. Hence, VA is determined by display screen size (DS) and VD.

Figure 4.1. Geometric relations in perspective display viewing: (a) optimally
positioned viewpoint, (b) magnified display, (c) minified display.
VD = Viewing distance. VA = Visual angle of the display.
FOV = Field of view of the display. X and O are objects in the
virtual world, whose images are depicted on the display.

(3) The display field of view (FOV) is the angle depicted by the display
image from a hypothetical point where all light rays would converge. In
Figure 4.1, this point happens to be precisely at the eye position or view
point.

(4) This point of convergence is defined as the station point or center
of projection. At the VD where the view point and station point are
identical, the visual angle and the FOV will be identical; hence, objects in
the real world, and their depiction on the display, will be aligned. This is
the case in Figure 4.1(a). Naturally, for a given viewing distance this
correspondence will not always be observed. Figure 4.1(b) shows a situation
in which the correspondence is violated because the display is magnified,
relative to the viewing distance, as if seen through a telephoto lens. Here
the station point would be quite far from the display. Points a and b
represent where the observer would perceive objects x and o to be. Figure
4.1(c) shows a minified display like a wide-angle lens, imposing the station
point at a very small VD.
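
The relations among DS, VD, VA, FOV, and the station point can be summarized
numerically. The following Python sketch assumes a flat display of width DS
viewed on axis; the function names and the tangent-based definition of
magnification are assumptions introduced here for illustration rather than
definitions given by the cited authors.

```python
import math


def visual_angle(ds, vd):
    """Visual angle (degrees) subtended by a screen of width ds at distance vd."""
    return math.degrees(2.0 * math.atan(ds / (2.0 * vd)))


def station_point_distance(ds, fov_deg):
    """Distance from the screen at which the visual angle equals the FOV."""
    return ds / (2.0 * math.tan(math.radians(fov_deg) / 2.0))


def magnification(ds, vd, fov_deg):
    """Assumed definition: tan(VA / 2) / tan(FOV / 2).

    Equals 1 when the eye is at the station point, exceeds 1 when the eye
    is nearer to the screen (magnified), and is below 1 when the eye is
    farther away (minified).
    """
    return station_point_distance(ds, fov_deg) / vd


if __name__ == "__main__":
    ds, fov = 0.30, 40.0  # hypothetical screen width (m) and depicted FOV (deg)
    sp = station_point_distance(ds, fov)
    for vd in (sp / 2.0, sp, 2.0 * sp):
        print(round(vd, 3), round(visual_angle(ds, vd), 1),
              round(magnification(ds, vd, fov), 2))
```

With a 0.30 m wide display depicting a 40° FOV, the station point lies roughly
0.41 m from the screen. An eye placed nearer than that sees a magnified
display, as in Figure 4.1(b), and an eye placed farther away sees a minified
one, as in Figure 4.1(c).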

When a 3D perspective display is designed to present an outside-in or
"God's Eye" perspective, two additional parameters must be assumed as shown in
Figure 4.2. The elevation viewing angle is the angle from which the display
"camera" looks down upon (or up to) the highlighted object in a display. In
the cockpit display of traffic information, used to illustrate this point in
Figure 4.2, this object is the circled plane in the center of the display,
referred to as "own ship." The azimuth viewing angle is the angle from which
the object is viewed, relative to a canonical "straight ahead" orientation.
In Figure 4.2, this azimuth angle is about 8° to the right.

Figure 4.2. Perspective display viewing. The display illustrates an
elevation viewing angle of 30° (from above), and an azimuth viewing angle of
8° (to the right) (from Ellis et al., 1987).

4.1.2 Distortions of perspective viewing. It is apparent that these
various factors considerably complicate the design of perspective displays,
and suggest the need for principles which optimize the setting of all
parameters.  Some  empirical  evidence  does  however  exist  to  guide  the  parameter 
specification.  Much  of  this  evidence  is  in  the  form  of  measured  distortions 
perceived  as  the  scene  is  viewed  from  particular  perspectives.  For  example, 
Roscoe, Corl, and Jensen (1981) point out that perspective displays of this
sort are perceptually "minified." That is, even with the viewpoint at the
station  point  as  shown  in  Figure  4.1(a),  objects  are  perceived  as  closer 
together  (or  smaller)  than  they  really  are.  Hence,  using  the  size-distance 
invariance  equation,  the  distance  of  those  objects  from  the  observer  is  seen 
as  greater  than  it  really  is.  In  aviation,  this  could  lead  to  faster  than 
desirable  approaches  to  a  runway.  Roscoe  et  al.  conclude  that  a  display 
magnification  of  approximately  1.3  is  necessary  to  compensate  for  this 
perceptual  minification.  They  also  point  out  that  for  aviation  applications, 
the FOV should be sufficiently wide (around 40°) so that the forward path of
an aircraft can be viewed even as the aircraft is "crabbed" 20° to one side or
the  other  into  a  crosswind. 
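
The two recommendations attributed to Roscoe et al. can be expressed as simple
checks. This sketch assumes an on-axis flat display, a FOV centered on the
aircraft's heading, and a tangent-based definition of magnification; these
assumptions, the screen width used in the example, and all function names are
illustrative and are not taken from the cited study.

```python
import math


def viewing_distance_for_magnification(ds, fov_deg, m):
    """Eye-to-screen distance that yields magnification m.

    ds is the screen width and fov_deg the depicted field of view.
    Magnification is assumed to be tan(VA / 2) / tan(FOV / 2), so the
    required distance is the station-point distance divided by m.
    """
    if m <= 0:
        raise ValueError("magnification must be positive")
    station = ds / (2.0 * math.tan(math.radians(fov_deg) / 2.0))
    return station / m


def path_visible(fov_deg, crab_deg):
    """True if a flight path offset by crab_deg from the nose lies within
    a FOV of fov_deg centered on the heading."""
    return abs(crab_deg) <= fov_deg / 2.0


if __name__ == "__main__":
    # Hypothetical 0.20 m wide display depicting a 40 degree FOV.
    print(round(viewing_distance_for_magnification(0.20, 40.0, 1.3), 3))
    print(path_visible(40.0, 20.0), path_visible(30.0, 20.0))
```

Under these assumptions a 40° FOV just contains a flight path displaced 20° to
either side, which is consistent with the width suggested above.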

A  set  of  biases  and  distortions  associated  with  viewing  world-referenced 
perspective  displays,  such  as  the  air  traffic  controller  display  shown  in 
Figure 4.2, have been identified and modeled in a program of research carried
out by Ellis, McGreevy, and their colleagues at NASA Ames Research Center
(Ellis, 1989; Ellis & Grunwald, 1989; McGreevy, Ratzlaff, & Ellis, 1986;
McGreevy & Ellis, 1986; Ellis, Smith, & Hacisalihzade, 1989; Ellis, McGreevy,
& Hitchcock, 1984; 1987). Their empirical studies have modeled the kinds of
biases an observer would make in trying to estimate the bearing and elevation
that an "intruder" aircraft would show relative to own-ship in a display like
Figure  4.2.  The  schematic  display  used  in  their  research  is  shown  in  Figure 
4.3.

Figure 4.3. Perspective viewing display used by Ellis and his colleagues.
The subject is asked to estimate the azimuth (α) and elevation angle (β) of
the target cube from the reference cube. This typifies the kinds of judgment
that an air traffic controller might make regarding the relative positioning
of two aircraft. Observers indicate their response by adjusting the two
angular indicators shown beside the display.

Ellis  and  his  colleagues  have  identified  two  general  forms  of  error  that 
are  shown  when  people  make  these  visual  estimations  from  perspective  displays. 

(1)  Estimation  of  elevation  angle  tends  to  be  perceptually  exaggerated. 
This  bias  toward  overestimation  of  positive  angles  (underestimation  of 
negative)  is  greatest  at  +  30°.  The  bias  is  also  modified  by  the  display  FOV, 
and  is  smallest  with  larger  FOVs  (e.g.,  120°). 

(2) Azimuth estimations are biased in a direction towards the angle
formed on the 2D image plane. That is, the relative bearing is perceived
as closer to the bearing parallel to the image plane than is veridical.

This latter form of bias is modeled to be the result of two perceptual
effects:

(i) the 2D-projection effect is the result of an "averaging" process,
whereby the estimated azimuth angle is the average of its
geometrical perspective angle, and the angle projected onto the
image plane; and

(ii) the virtual space effect results from the viewer's relatively
automatic assumption that the view point is located at the
station point. The observer then hypothesizes the virtual space
that would be required to produce the given displayed image.
Thus, for example, in Figure 4.1(b) the observer would be biased
to assume that the "x" and "o" in the world are closer to the
positions marked "a" and "b," and extended from the dotted line.
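
The averaging process described in (i) can be illustrated with a simplified
sketch. It assumes a direction lying in the ground plane, a parallel
projection viewed from a given elevation angle, an azimuth viewing angle of
zero, and that the "geometrical perspective angle" is the true azimuth in the
depicted space. These simplifications and the function names are assumptions
made here; the model of Ellis and colleagues also depends on FOV and viewing
distance, which this sketch ignores.

```python
import math


def projected_azimuth(true_az_deg, view_elev_deg):
    """Azimuth as drawn on the image plane (degrees).

    The lateral component of the direction is unchanged by the projection,
    while the depth component is foreshortened by sin(elevation).
    """
    az = math.radians(true_az_deg)
    elev = math.radians(view_elev_deg)
    lateral = math.sin(az)
    depth = math.cos(az) * math.sin(elev)
    return math.degrees(math.atan2(lateral, depth))


def estimated_azimuth(true_az_deg, view_elev_deg):
    """Average of the true azimuth and its image-plane projection."""
    return 0.5 * (true_az_deg + projected_azimuth(true_az_deg, view_elev_deg))


if __name__ == "__main__":
    for elev in (30.0, 60.0, 90.0):
        print(elev, round(projected_azimuth(45.0, elev), 2),
              round(estimated_azimuth(45.0, elev), 2))
```

In this sketch the estimate is pulled toward the lateral direction of the
image plane at low elevation viewing angles, and the bias disappears for a
top-down (90°) view.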

The magnitude of both of these effects combines interactively with VA,
VD, FOV, viewing azimuth, and elevation angles, along with the actual angle
and elevation between the target objects, to form a fairly complex set
of effects, some of which are shown in Figure 4.4 (McGreevy et al., 1986).

The figure presents different combinations of VD and FOV, and shows how
positions within this two-dimensional parameter space influence the two
sources of error.

Finally,  Ellis,  Smith,  and  Hacisalihzade  (1989)  have  noted  that  the 
kinds  of  estimations  people  make  via  world-referenced  coordinates,  like 
adjusting  the  angles  in  Figure  4.3,  are  qualitatively  different  from  those 
shown  by  egocentric  judgments  (pointing  or  looking). 

4.1.3 Corrective measures. Equally important to the identification of
these effects are the prescriptions that can be made to minimize these and
other biases, which will distort the perceived location of objects. These
prescriptions may be placed into three general categories: parameter
specifications, geometric enhancements, and symbolic enhancements (Ellis,
1989; Ellis & Grunwald, 1989).

Parameter specifications. The schematic representation in Figure 4.4
reveals  that  minimum  biases  will  occur  with  a  large  FOV,  matched  by  a  large 
viewing  angle.  Biases  in  general  tend  to  be  smaller  when  the  VA  and  FOV  are 
matched.  Furthermore,  as  might  be  expected,  higher-elevation  viewing  angles 
will  reduce  the  magnitude  of  azimuth  errors  (in  the  extreme,  a  90°  viewing 
angle  is  a  top-down  view,  equivalent  to  a  2D  map  display,  and  will  have  near 
perfect  azimuth  angle  judgments).  In  optimizing  their  perspective  display  for 
a  cockpit  display  of  traffic  information,  Ellis  and  his  colleagues  have  chosen 
an  elevation  angle  of  30°  and  an  azimuth  angle  of  8°  off  center  from  behind 
own-ship. This choice is based upon pilot opinion.

Subsequent research by Kim et al. (1987), using the 3D tracking paradigm
described in Section 3.1, revealed that tracking performance was nearly
equivalent at elevations between 30° and 60°, but fell off with more extreme
angles. Tracking performance was essentially equivalent across azimuth angles
from 0° to 40°.

Figure 4.4. Viewing conditions used by McGreevy et al. (1986), showing how
these affect the two forms of bias produced by the virtual space effect and
the two-dimensional effect.

In a slightly different paradigm, Yeh and Silverstein (in preparation)
asked subjects to make judgments of altitude and distance from the viewpoint
of perspectively viewed aircraft symbols. Changing elevation angle produced a
tradeoff in the quality of performance between the two dimensions of judgment
(altitude and distance). Altitude judgments became worse, and distance
judgments improved, at the higher elevation angle (45° vs. 15°). However, the
tradeoff was such that the combined performance on both dimensions was better
at 15° than at 45°.

Geometric enhancements. These include actual biases or distortions
incorporated into a perspective display, which may compensate for the observed
perceptual biases. The proposal by Roscoe, Corl, and Jensen (1981) that there
should be a 1.3 magnification factor in perspective viewing displays is one
such example. Another is the characteristic that the vertical dimension is
routinely amplified relative to the horizontal for ATC displays (Ellis,
McGreevy, & Hitchcock, 1987). A third geometric enhancement is to introduce a
nonlinear scaling of object size with distance, to ensure that displayed
objects do not become vanishingly small at extreme distances.
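
The third enhancement can be sketched as follows. The source does not specify
a scaling rule, so the compressive exponent and the size floor below are
purely illustrative assumptions, as are the function names. The sketch only
contrasts true perspective scaling (size proportional to 1/distance) with a
rule that shrinks more slowly and never drops below a minimum size.

```python
def perspective_scale(distance, reference_distance=1.0):
    """True perspective: drawn size falls off as 1 / distance."""
    return reference_distance / distance


def compressed_scale(distance, reference_distance=1.0, exponent=0.5, floor=0.2):
    """Hypothetical nonlinear rule: slower falloff, clamped at a minimum.

    exponent < 1 compresses the falloff; floor is the smallest scale
    factor ever drawn. Both values are assumptions for illustration.
    """
    scale = (reference_distance / distance) ** exponent
    return max(floor, scale)


if __name__ == "__main__":
    for d in (1, 4, 25, 100):
        print(d, perspective_scale(d), compressed_scale(d))
```

With these example values a symbol 100 reference distances away is drawn at
one fifth of its reference size rather than one hundredth, at the cost of no
longer providing a veridical relative-size cue.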

Symbolic enhancements. These represent artificial elements introduced
into the display that are not part of the virtual world, but serve to
disambiguate the display, or enhance its perceptibility. Most effective here
are the "posts" upon which each aircraft in Figure 4.2 is mounted. When these
are coupled with a grid surface to which the post is attached, and are marked
with an "x" indicating own-ship's altitude, then the true altitude and
location of the intruder aircraft become unambiguous. Another symbolic
enhancement developed by Ellis, McGreevy, and Hitchcock (1987) and shown in
Figure 4.2 is the second post, connected to the ground from a point on the
vector predicting the aircraft's flight path. This feature unambiguously
specifies heading. While this feature does not eliminate the azimuth
estimation error described above (Figure 4.4), it does eliminate the tendency
of this error to increase as the viewing perspective elevation angle
decreases from the top-down toward the horizontal (Ellis & Grunwald, 1989).

Numerous other symbolic enhancements can be designed into such displays,
including the presentation of digital readouts, color, or brightness coding.
At this point, however, the display designer must be very cautious that the
inclusion of extra objects does not create sufficient clutter to neutralize
the ease of interpretation which led to the choice of 3D representation in the
first place. Wickens, Haskell, and Harte (1989a,b) discuss ways in which
symbolic enhancements can be integrated into objects that are already part of
the display, thereby minimizing the addition of display clutter.

4.2 Observer-centered Cue Implementation

Four major classes of techniques have been used to convey a sense of
depth through nonperspective, observer-centered cues. These involve
holographic displays, multiplanar displays, binocular displays, and active
parallax displays. Displays within each of these categories in turn can be
described in terms of whether special viewing glasses must be worn. If
glasses are not worn, then the displays are described as autostereo displays.
Holographic and multiplanar displays are autostereoscopic. Binocular displays
may or may not be.

4.2.1 Holographic displays use a technique of optical interference
between images projected from two different light sources to create a 3D image
(Hodges, Love, & McAllister, 1987; Okoshi, 1980). While the image is
"virtual" and exists in 3D space, it still leads to some distortion in
relative distance judgment (Frey & Frey, 1985). In addition, there are
numerous technological difficulties in generating real-time holography that
can be dynamically updated (Hopper, 1986). Also, holographic displays are, by
design, limited in the field of view, and the display will change with the
angle at which it is viewed (Williams & Garcia, 1989a,b). Good technical
reviews of holographic display technology may be found in Okoshi (1980) and
Hopper (1986).

4.2.2 Multiplanar displays. Multiplanar displays are typically created
through a system of rotating or vibrating mirrors (Williams & Garcia, 1989a,b;
Huggins & Getty, 1982). Such displays are also referred to as "volume
visualization displays" because they can create a virtual volume within which
a 3D image can be constructed. A current version marketed by Genisco creates
a maximum volume of 25 cm³, with a 30-Hz refresh rate. A larger version, with
a semi-spherical volume and a base of 3 ft, has been developed by Williams and
Garcia (1989a,b). Two envisioned examples are shown in Figure 4.5. The
advantages of this display are its virtuality, and the fact that multiple
users can walk around and inspect the display. Its costs are related to the
early state of development of the technology (and the resulting financial
costs), and to the fact that it is not appropriate for use in creating solid
objects, area shading, and filling (Williams & Garcia, 1989a,b). Huggins and
Getty (1982, 1984) have used the volume visualization display to examine
various aspects of 3D compatibility in manipulating and orienting 3D objects.

4.2.3 Stereoscopic displays. In contrast to holography and multiplanar
displays, stereoscopic displays do not attempt to recreate a "virtual" 3D
environment, but as noted in Section 2.7.2, simulate depth by presenting two
disparate images, one typically presented to each eye, with each image
depicting the disparate view that would be obtained if the virtual image were
viewed. Stereoscopic displays may be categorized in terms of whether the two
images are viewed simultaneously or in alternation. These are described as
time-parallel and time-multiplexed displays respectively (Hodges & McAllister,
1985; Johnson, 1989).

Figure 4.5. Two possible examples of applications of multiplanar displays
(Williams & Garcia, 1989b).

Time-parallel displays are typically created by presenting the two
disparate images in different wavelengths of light and then using different
filters on each lens of a pair of glasses to present a different image to each
eye. This technique is used in what is known as an anaglyph display, and was
commonly used in 3D movies during the 1950s. The classic stereoscopic
display employed in the stereoscope or View-Master is also time-parallel, and
uses optical techniques to present the two images.

Techniques for using time-parallel displays with slide projectors for
group presentations are described by Uixson (1989). Fuller and Philips (1989)
discuss the use of time-parallel stereo display for aerial refueling or remote
control of off-road vehicles, where stereo depth judgments must be extremely
precise. 3D Image Tek Corporation of Glendale, California, is one
manufacturer of such a system. An important issue with regard to
time-parallel displays is the viewing separation and degree of convergence of
the two camera lenses. This is typically set at the separation of the eyes,
around 2 1/2". However, some advantage is gained if the separation can be
increased for viewing at great distances, when the amount of natural disparity
is reduced.
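
The effect of lens separation on disparity follows from simple viewing
geometry, sketched below. This is standard trigonometry rather than a method
taken from the studies cited; the function names and the example distances
are assumptions chosen for illustration. Two points straight ahead at
different distances subtend different convergence angles at the two lenses,
and the difference between those angles is the angular disparity.

```python
import math


def vergence_angle(separation, distance):
    """Angle (radians) subtended at a point by two lenses `separation` apart.

    The point is assumed to lie straight ahead, midway between the lenses.
    Both arguments must use the same length unit.
    """
    return 2.0 * math.atan(separation / (2.0 * distance))


def disparity_arcmin(separation, near, far):
    """Angular disparity between points at `near` and `far`, in arc minutes."""
    delta = vergence_angle(separation, near) - vergence_angle(separation, far)
    return math.degrees(delta) * 60.0


if __name__ == "__main__":
    # Hypothetical scene: two objects 1000 in. and 1100 in. away.
    for separation in (2.5, 10.0):
        d = disparity_arcmin(separation, 1000.0, 1100.0)
        print(f"separation {separation:4.1f} in.: {d:.2f} arcmin")
```

For distant objects the disparity grows almost in proportion to the lens
separation, which is why a wider-than-natural separation restores depth
resolution that is otherwise lost at long range.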

Another issue with regard to stereo camera positioning concerns whether
the camera configurations should be parallel or converging. For a number of
reasons, converging configurations are superior for close-in work such as that
involved in remote manipulation or teleoperation (Diner & von Sydow, 1988).

There also appears to be a performance tradeoff caused by the amount of
convergence of camera angle. Large angles, with a wider separation than the
two eyes, generating "hyperstereopsis" or "super stereo," produce very precise
depth resolution, but at the expense of some 3D distortion of objects'
location in space. Smaller angles reduce distortion, but also allow less
resolution. Diner and von Sydow (1988) identify certain compromise viewing
conditions that can optimize the position on the tradeoff for close-in viewing
used in teleoperation.

Time-multiplexed displays are implemented by presenting the two
disparate images at a rapid rate of alternation, typically around 30 Hz. To
ensure that each image is viewed only by the appropriate eye, the viewing is
typically accomplished through glasses in which each lens is polarized to a
different orientation. A rapid shutter system is synchronized with the
display generator, so that when the left eye image is displayed, polarized
light will only pass through the left lens, while opposite polarization aligns
the right eye image and lens on the alternate cycle. This alternation may be
accomplished by driving the lenses of the polarized glasses themselves, through
PLZT or Liquid Crystal Shutter (LCS) technology (Bridges & Reising, 1987).

Alternatively, LCS technology can be implemented on an overlay on the front of
the CRT display screen. These two approaches are sometimes described as
"active" and "passive" stereoscopic systems. The large LCD display required
of passive systems has a considerably greater cost (around $6,000 for a 19"
plate) than the smaller glasses of the active system, which may be available
for as low as $300.00 (Johnson, 1989).

Binocular time-multiplexed displays are currently the most frequently
used stereoscopic display technology because of their relatively high
compatibility with CRT-based image generating devices marketed by Tektronix
Corporation of Portland, Oregon, and StereoGraphics Corporation of San Rafael,
California. While their degree of effectiveness in supporting task
performance will be discussed in Section 5, the costs of these forms of
stereographic displays are as follows:

(1) Physical constraints. Both LCS and PLZT technology are more
constraining when the viewer must wear glasses whose lenses are synchronized
to the display, because of the wiring necessary and the greater expense and
vulnerability of the glasses. PLZT also requires high voltages in the
glasses. Because LCS technology can be implemented on a wide screen display
overlay, the alternation or shuttering mechanism can be imposed at the
surface of the display, rather than in the glasses, which then only need to
be equipped with two nonshuttered polarized lenses. This is a clear
advantage.

(2) Viewing perspective. Alternating frame technologies produce a
distorted image as the viewing perspective changes, and the 3D imaging is lost
if the head is tilted.

(3) Image intensity. The use of polarization, by definition, eliminates
much of the light energy, and images become less intense.

(4) Spatial resolution. As typically implemented on a raster display,
left and right eye images are generated on alternating raster lines. This
feature degrades the vertical resolution of the display. An alternative is to
generate each eye image on all raster lines. However, this technique will
halve the frame rate to 30 Hz and may produce perceptual flicker (Johnson,
1989).

(5) Ghosting. The compellingness of the alternating-frame stereoscopic
view depends upon the extent to which the off-cycle image does not stimulate
the retina. The combined effects of slow CRT phosphor decay and retinal
sensitivity to the persisting image can lead to a perception in which the
residual image of the off-cycle display is viewed by the other eye. This
produces a "ghost" double image. Ghosting may be reduced or eliminated by the
appropriate choice of display hue. Red or orange hues present minimal ghosting
compared to white (Yeh & Silverstein, 1989). However, the display luminance
is greatly reduced in this case.

(6) Resolution and fusion. Large amounts of disparity cannot be
adequately fused, and a double image becomes quite apparent. The range of
disparity over which fusion is possible is referred to as Panum's fusion area.
Within a Tektronix SGS 430 display, these limits are approximately 21 minutes
of arc for crossed disparity, and 24 minutes for uncrossed (Yeh & Silverstein,
1989). While roughly 10% of the population is unable to fuse static
stereoscopic images, this limit appears less constraining for dynamic images
(Tittle, Rouse, & Braunstein, 1988).
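
Two of the costs listed above, (4) and (6), lend themselves to a small
numerical sketch. The 60-Hz refresh rate is inferred from the statement that
halving yields 30 Hz; the raster line count, the sign convention (negative for
crossed disparity), the small-angle conversion from screen parallax to visual
angle, and all function names are assumptions made for illustration. The
fusion limits are the values quoted above for one particular display and
should not be read as general constants.

```python
import math

# Limits reported above for the Tektronix display (Yeh & Silverstein, 1989).
CROSSED_LIMIT_ARCMIN = 21.0
UNCROSSED_LIMIT_ARCMIN = 24.0


def per_eye_budget(refresh_hz, raster_lines, scheme):
    """Return (images per second, raster lines) available to each eye.

    "alternate_lines": each eye gets every other raster line of every frame.
    "alternate_frames": each eye gets all lines, but only every other frame.
    """
    if scheme == "alternate_lines":
        return refresh_hz, raster_lines // 2
    if scheme == "alternate_frames":
        return refresh_hz / 2, raster_lines
    raise ValueError(f"unknown scheme: {scheme}")


def screen_disparity_arcmin(parallax, viewing_distance):
    """Small-angle estimate of the visual angle of an on-screen parallax.

    Both arguments use the same length unit; the sign of parallax is kept.
    """
    return math.degrees(parallax / viewing_distance) * 60.0


def is_fusible(disparity_arcmin):
    """Check a disparity (negative = crossed) against the reported limits."""
    if disparity_arcmin < 0:
        return -disparity_arcmin <= CROSSED_LIMIT_ARCMIN
    return disparity_arcmin <= UNCROSSED_LIMIT_ARCMIN


if __name__ == "__main__":
    print(per_eye_budget(60, 480, "alternate_lines"))
    print(per_eye_budget(60, 480, "alternate_frames"))
    angle = screen_disparity_arcmin(0.4, 60.0)  # 0.4 cm parallax at 60 cm
    print(round(angle, 1), is_fusible(angle), is_fusible(-angle))
```

In the example, the same magnitude of parallax falls inside the uncrossed
limit but outside the crossed limit, which shows why the two directions of
disparity have to be budgeted separately.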

In addition to LCS and PLZT technology, a third alternating-pairs
approach to 3D imaging uses vertical, rather than horizontal, disparity (Jones
et al., 1984). Two images, depicted from about 1-1.5° visual angle disparity,
are alternately presented. Effective stereo perception is obtained at
alternation rates between 4 and 30 Hz. This technique has two distinct
advantages over the alternate-frame binocular stereoscopic techniques
described above:

(1) Because both eyes see the same image, there is no need for
alternating frames and polarized or shuttered glasses. It is a binocular
autostereoscopic display.

(2) The absence of glasses allows stereo to be perceived as the head is
rotated.

The primary cost of this technique is related to its poorer image
quality, and the fact that the display produces a "rocking" sensation when
viewed (Hodges & McAllister, 1985).

4.2.4 Active parallax displays. Active parallax displays capitalize on
the compelling sense of depth created by relative motion. Such a display,
designed by Suetens et al. (1988), uses physical movement of the head, as
recorded by head position sensors, to drive the relative placement of objects
on the image screen, thereby presenting the same sense of motion parallax that
would be created if a true 3D scene were examined from different viewpoints.
Suetens et al., describing the applications of this technique to medical
imaging, have coupled it with stereo viewing, providing what they describe as a
compelling sense of depth.
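
The geometry behind head-coupled parallax can be sketched in a few lines.
This is not the implementation of Suetens et al.; it is a generic
line-of-sight projection under an assumed coordinate system (screen in the
plane z = 0, x to the right, y up, z toward the viewer), and the function name
is hypothetical. Each scene point is drawn where the line from the tracked
head position to the point crosses the screen.

```python
def screen_position(point, head):
    """Screen (x, y) at which `point` must be drawn for a viewer at `head`.

    Assumed coordinates: screen in the plane z = 0, viewer at z > 0.
    The point may lie behind (z < 0), on, or in front of the screen,
    but must be farther from the viewer's plane than the viewer is.
    """
    px, py, pz = point
    hx, hy, hz = head
    if hz <= 0 or pz >= hz:
        raise ValueError("head must be in front of the screen and the point")
    t = hz / (hz - pz)  # fraction of the head-to-point line at z = 0
    return (hx + t * (px - hx), hy + t * (py - hy))


if __name__ == "__main__":
    behind = (0.0, 0.0, -50.0)    # 50 units behind the screen
    in_front = (0.0, 0.0, 25.0)   # 25 units in front of the screen
    for head_x in (-10.0, 0.0, 10.0):
        head = (head_x, 0.0, 50.0)
        print(head_x, screen_position(behind, head), screen_position(in_front, head))
```

Points behind the screen plane shift in the same direction as the head and
points in front shift the opposite way, while points on the screen plane stay
fixed. Because every point must be re-projected on each head movement, the
update rate requirement discussed next follows directly.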

However, one drawback acknowledged by the authors is that active
parallax imposes a need for rapid display updating, since this updating must
be directly tied to head movement. Hence, unless images are relatively
simple, the technique becomes computationally quite intensive. As a result,
it is restricted to wire frame figures. The authors note in their rendering
that hidden line algorithms are also sacrificed. However, the combined
compellingness of stereopsis and motion apparently resolves any ambiguity,
although no data are cited.

5.0 DISPLAY APPLICATIONS

The  discussion  in  this  section  concerning  the  implementation  of  3D 
technology  in  displays  takes  a  different  perspective  from  that  presented  in 
Sections  2  and  3,  where  the  emphasis  was  on  the  psychological  principles 
underlying  the  perception  of  depth.  The  display  designer  is  less  concerned 
about  the  psychology  of  depth  perception,  per  se,  than  about  the  need  to 
present information in an interpretable fashion, capitalizing on depth
perception  where  it  is  useful.  In  the  following  sections  we  consider  a 
variety  of  applications  which  have  employed  a  display  representation  of  depth 
either  to  represent  depth  itself,  or  to  simulate  another,  nondistance, 
dimension. 

In addition to the separate categories of application (e.g., aircraft
display, graphic data depiction), the conclusions in the reports studied are
grouped according to the following four experimental or review orientations:
(1) those studies that have compared a 3D representation to a 2D counterpart,
(2) those studies that have examined different facets of 3D display
representation (e.g., varied the number of cues, or examined the presence of
distortions), (3) those studies that have implemented and evaluated 3D display
technology, and (4) those papers that have proposed a 3D display technology or
application. Since the emphasis of the current report is on empirical data,
studies in the latter category will be included only to the extent that they
have not yet been followed by empirical evaluations. Two other points are
relevant to the following review: (1) Some applications may fit within more
than one category, and for reference purposes are discussed in both. (2)
Given the special status of stereoscopic displays as an emerging display
technology, these displays will be given prominence in some of the following
sections, where recent developments have been more extensive.

5.1 Aircraft Cockpit Applications


Because of the six degrees of freedom in Euclidean space which
characterize flight (translation in x, y, and z; rotation in pitch, roll, and
yaw), and because of the apparent difficulty in integrating this information
from traditional 2D display representations of the cockpit instrument panel,
flight path guidance displays have provided a natural domain for development
and application of 3D display technology, with major research programs found
at Wright-Patterson AFB (AFWAL and the Air Force Aerospace Medical Research
Laboratory Human Engineering Group), Naval Air Development Center, and NASA
Langley. To provide a context for the following discussion, Figure 5.1
presents a generic prototype of such a display. Variations of this prototype
have served in most of the studies discussed below.

Studies reported in the following section have been driven heavily by
three forces: (1) The emergence of the microwave landing system (MLS),
allowing curved approaches to airports (Remer & Billmann, 1987), has
highlighted the concern for flight displays that support precise situation
awareness of the position along the flight path through a perspective "highway
in the sky." (2) 3D display technology has also enabled the creation of
displays that provide the combat pilot with a greater degree of tactical
awareness (Boff & Calhoun, 1983). (3) Technical developments in visual flight
simulation have continued to address the best ways of making the pilot aware
of his orientation and altitude over the ground, for both civilian and
military flight.

5.1.1 2D versus 3D comparison. Surprisingly few studies have actually
been undertaken to provide an objective comparison of 3D and 2D
representations of the same information in flight. One such study, carried
out by Grunwald, Robertson, and Hatfield (1981), examined a 3D "highway in the
sky" display, with preview and prediction, designed for helicopter approaches.
A prototype using textural gradient and convergence to convey flight path
depth was compared with a prototype that presented the same predictive and
preview information, without the supplementary depth cues. Although
performance on the former display was superior, the difference was not large,
and emerged only at the longest predictive interval (1000 feet). Data were
reported for only one subject. Another study by Wickens, Andre, and Moorman
(in preparation) contrasted a 3-view planar display with an outside-in
perspective display to support navigation across a simulated airspace. The
3-view display presented a forward looking attitude display indicator, a side
view vertical situation indicator, and a top view map. Performance of
nonpilot subjects was superior with the plan view displays, because of
difficulties that subjects had resolving depth ambiguities along the
line-of-sight when using the perspective display.

Figure 5.1. Example of perspective flight path display (Wickens, Haskell, &
Harte, 1989a,b). The predicted aircraft position is shown as the smaller
black aircraft symbol. The desired flight path tunnel recedes into the
distance.

5.1.2 3D predictor-preview implementation. In contrast to the weak
empirical evidence described above for an advantage of 3D over 2D when both
display formats provide equivalent information, there is strong evidence that
a flight path display incorporating 3D prediction and preview is superior to
one without such information. Studies by both Reising, Barthelemy, and
Hartsock (1989) and Wickens, Haskell, and Harte (1989a,b) have drawn such a
conclusion with experiments using a relatively large sample of subjects. It
is, of course, impossible to determine whether the advantages were offered by
the 3D format itself or by the predictor and preview information, since
conditions were not examined in which the latter information was offered in
2D form. Both studies, however, offered further conclusions about the
implementation of 3D technology to be discussed below.

5.1.3 3D implementation of flight displays. Grunwald (1984) provides a
detailed  discussion  of  the  implementation  of  a  3D  perspective  display  for 
aircraft  landing  approaches,  implemented  at  NASA  Langley  Research  Center,  and 
describes  an  evaluation  based  on  four  extensively  trained  nonpilots,  and  two 
pilots.  The  display  provides  a  receding  series  of  boxes,  forming  a  flight 
path tunnel in the sky. Two aspects of his study should be noted: (1)
Specific points equidistant along the tunnel were graphically highlighted,
thereby providing "looming" cues of depth change proportional to forward
velocity, as the plane passes through the tunnel. (2) Grunwald experimentally
manipulated the presence or absence of a 3D object representation of the
predictor  symbol.  While  this  symbol  was  located  at  a  position  in  perspective 
depth,  it  could  take  on  the  form  either  of  a  flat  2D  cross  or  a  schematic 
perspective  aircraft.  The  latter  provided  additional  information  regarding 
the  heading  and  pitch  of  the  predictor,  conveyed  by  its  orientation  in  depth. 
This  information  could  not  be  obtained  from  the  flat  cross.  Grunwald  found 
that  little  advantage  was  gained  by  this  added  3D  feature.  There  was  some 
evidence  that  its  presence  allowed  reduced  control  activity,  but  it  did  not 
improve  flight  path  tracking  accuracy. 

Adams (1982) describes a 3D perspective display proposed for commercial
aircraft which incorporates a "follow-me" box (Figure 5.2). This is a 3D
perspective representation of a box, sliding along the desired flight path, a
fixed distance in front of the aircraft. Because it is drawn in 3D
perspective, the displayed shape of the box will change as the pilot moves off
the command flight path and gains a viewing perspective that is not from
directly behind. Only in the latter position will the box's 3D edges be
hidden, and will it be perceived as a 2D rectangle. Simulation flight tests
of the display conducted with nine pilots revealed favorable acceptance.
However, pilot comments emphasized the need for precise information regarding
the absolute value of deviations, which was not available from this more
"holistic" display.

Wickens,  Haskell,  and  Harte  (1989a, b)  tested  a  3D  perspective  display 
with  prediction  and  preview  for  MLS  landing  approaches  using  20  pilots.  Three 
conclusions  from  their  experiment,  bearing  on  the  implementation  of  3D  cues, 
are  relevant  to  the  current  discussion:  (1)  Airspeed  was  conveyed  by  the 
perspective  size  of  the  predictor,  as  this  predictor  was  made  to  recede 
(contract  for  high  airspeed)  or  approach  (expand  for  low  airspeed)  in  depth. 
This  feature  was  more  effective  for  airspeed  control  than  was  a  separate 
linear  indicator.  (2)  Connection  of  the  corners  of  the  preview  flight  path 
box  by  lines,  as  shown  in  Figure  5.1,  provided  a  cue  of  linear  perspective 
which  facilitated  judgments  of  orientation  from  the  flight  path.  (3) 
Incorporating  the  cue  of  interposition  by  using  hidden  line  algorithms  on  the 
flight  path  tunnel  was  particularly  useful  in  resolving  perceptual  ambiguities 
of  depth  when  the  tunnel  was  viewed  from  below. 

Nataupsky and Crittendon (1988) evaluated a perspective display for MLS
landing with and without stereopsis (see 5.1.4), using Air Force pilots as
subjects. Two pictorial representations of the flight path were contrasted:
a simple "monorail," connected by posts to the ground, was contrasted with a
series of 20 square boxes connected to the ground (like sign posts), with a
constant "true size," which therefore diminished in display size with depth.
The "sign posts" provided relative size as a depth cue, along with more
precise information about error tolerance bands. The sign post representation
supported better performance in the particular task studied by the
investigators: the discrete response to sudden displacements from the flight
path.




A  major  research  program  at  the  Naval  Air  Development  Center  has  focused 
on  the  development  and  flight  testing  of  another  3D  perspective  display 
concept  developed  by  Filarsky  and  Hoover  (1983)  and  Scott  (1989).  The  display 
concept  has  been  successfully  flight  tested  in  the  air  on  a  USAF/Convair  NC- 
131,  an  F-14  fighter  and  in  a  low-fidelity  helicopter  simulator  (Scott,  1989). 

The studies carried out by Grunwald (1984) and Wickens et al. (1989a,b)
have also examined the issue of frame-of-reference within the context of
perspective displays. At issue is the extent to which the aircraft
representation, rather than the horizon and tunnel, should move relative to
the frame of the display. The comparative evaluations of the two reference
frames by Wickens et al. supported a traditional inside-out frame, while the
study carried out by Grunwald provided ambivalent evidence.

5.1.4  Stereo  enhancement.  Recent  developments  in  3D  flight  path
displays  have  focused  on  the  incorporation  of  stereoscopic  cues.  While  a
considerable  amount  has  been  written  about  the  potential  advantages
of  this  technology  (e.g.,  Boff  &  Calhoun,  1983;  Bridges  &  Reising,  1987),
there  have  been  only  a  small  number  of  studies  that  provide  solid  empirical
data  regarding  the  efficacy  of  such  displays  in  the  dynamic  flight
environment.  All  have  employed  LCD  alternating  frame  technology  to  present
the  stereo  images.

A  basic  laboratory  investigation  by  Kim  et  al.  (1987)  provided  an
empirical  framework  for  considering  the  more  applied  studies  in  this  area.
Their  study  evaluated  the  tracking  performance  of  subjects  (nonpilots)  in  a
three-dimensional  volume.  Subjects  manipulated  a  joystick  to  control  a
cursor  cube.  The  objective  was  to  track  a  target  cube,  which  moved  through
the  volume  and  was  attached  by  a  "post"  to  the  ground.  The  investigators
found  that  stereo  viewing  improved  tracking  performance  when  the  display  was
poor  in  visual  detail,  but  the  advantage  of  stereo  was  no  greater  than  the
advantage  offered  by  a  textured  ground  surface.  Furthermore,  the  two  cues
were  not  additive,  in  that  providing  stereo  to  the  rich  textured  display
yielded  no  further  improvement.

Nataupsky  and  Crittendon  (1988)  examined  the  effect  of  stereo  viewing  in
the  MLS  flight  path  display  described  in  Section  5.1.3.  As  noted  in  that
section,  two  formats  of  depth  information  were  employed  to  present  the  command
flight  path,  with  the  "monorail"  being  less  effective  than  the  "sign  post"
display  in  the  task  employed  by  the  investigators  (responding  to  sudden
displacements  from  the  flight  path).  Introduction  of  stereoscopic  viewing
improved  response  time  with  the  less  effective  monorail  display,  but  had  no
influence  on  performance  with  the  more  effective  sign  post  display,  a
conclusion  that  mirrors  the  one  drawn  by  Kim  et  al.  (1987).

Reising  et  al.  (1989)  measured  continuous  flight  path  performance  while
subjects  flew  a  tunnel  in  the  sky  with  prediction  and  preview  (see  also
5.1.3).  Subjects  flew  through  both  an  easy  and  a  difficult  course  (defined  by
the  number  and  magnitude  of  course  changes),  and  care  was  taken  to  incorporate
realistic  and  effective  nonstereo  depth  cues  of  relative  motion,
interposition,  relative  size,  linear  perspective,  and  ground  texture.  The
investigators  found  that  the  addition  of  stereo  viewing  provided  minimal
advantage  over  the  nonstereo  version,  although  there  tended  to  be  a  small
stereo  enhancement  with  the  more  difficult  flight  path.

Way,  Hobbs,  Qualy-White  and  Gilmour  (1989;  Way,  1989)  implemented  a  full
mission  simulation  on  a  fighter  simulator,  in  which  pilots  flew  both
air-to-air  and  air-to-ground  combat  scenarios.  Various  features  of  the
display  world  could  be  presented  either  in  2D  perspective  (Martin  &  Way,
1987)  or  3D  stereoscopic  representation.  Critical  among  these  were  a
perspective  situation  format,  presented  in  both  a  ground  and  air  mode,  a  HUD
display  of  a  flight  path  "highway  in  the  sky,"  and  a  top-down  map,  or
horizontal  situation  format.  Stereopsis  was  implemented  by  time  multiplexing
images  viewed  through  stereo  glasses.  Multiple  performance  measures  were
collected  during  both  mission  types,  both  with  and  without  stereo  viewing
features.  Consistent  with  the  results  of  other  stereo  cockpit  studies,  Way
et  al.  failed  to  find  any  advantage  for  stereo  viewing  of  the  flight  path
and  tactical  situation  in  the  full  mission  simulation.  Pilots  commented
negatively  on  the  loss  of  display  resolution  induced  by  the  stereo  images.
In  contrast,  the  perspective  viewing  created  in  the  air-to-ground  situation
display  was  quite  well  received  by  the  pilots.  Some  of  the  successful
features  of  stereo  implementation  for  static  aspects  of  this  simulation  will
be  described  below  (Way,  1988).

Collectively,  the  results  of  these  four  empirical  studies  lead  to  the
general  conclusion  that  in  a  dynamic  environment,  stereoscopic  viewing  can
sometimes  be  beneficial,  but  primarily  for  its  compensation  for  the  absence  of 
other  pictorial  and  motion  cues,  rather  than  for  any  enhancement  of  the 
effectiveness  of  well  implemented  cues  that  are  already  present.  This  finding 
is  consistent  with  the  conclusions  drawn  in  section  3,  which  indicate  that 
departures  from  the  additive  model  are  more  likely  to  be  observed  under 
dynamic  than  static  conditions.  Indeed,  one  characteristic  of  the  first  three 
studies  discussed  above  that  probably  enhanced  the  effectiveness  of  nonstereo 
cues  is  the  dynamic  property  of  the  flight  display  which  could  provide 
relative  motion  cues.  In  fact,  in  the  absence  of  motion  cues,  three 
additional  studies  described  below  have  provided  good  evidence  for  the 
effectiveness  of  stereo  in  an  aviation  environment,  although  none  of  these 
involved  flight  path  guidance.

Zenyuh  et  al.  (1988)  examined  the  effectiveness  of  stereo  and  of
relative  size  cues  in  a  tactical  situation  awareness  display.  Subjects  were
presented  with  a  large  number  of  displayed  aircraft  and  were  asked  to  count
the  number  of  a  class  of  aircraft  within  a  particular  region  of  the  3D
display.  Their  results  indicated  that  both  relative  size  cues  and  stereoscopy
improved  the  accuracy  of  performance,  and  that  those  effects  were  relatively
additive  and  of  equal  weight,  supporting  the  conclusions  drawn  in  Section  3.
It  should  be  noted  that  the  total  representation  of  depth  in  their  display  was
fairly  impoverished,  in  contrast  to  the  typical  research  on  flight  path
displays.  That  is,  only  relative  size  and  stereo  cues  were  available.
In  a  paradigm  that  was  similar  to  that  used  by  Zenyuh  et  al.  (1988),  Yeh
and  Silverstein  (in  preparation)  presented  subjects  with  a  schematic
perspective  terrain,  above  which  were  presented  two  static  objects.  Subjects
were  to  judge  which  object  was  closer  to  them,  and  which  was  at  a  higher
altitude,  using  either  perspective  only,  or  perspective  with  stereo.  Various
viewing  aspects  and  stimulus  configurations  were  used.  The  investigators
found  that  stereo  provided  a  clear  enhancement.  Furthermore,  the  pattern  of
results  was  such  that  the  greatest  stereo  enhancements  were  obtained  in  those
conditions  (and  for  those  judgments)  for  which  perspective  viewing  was
poorest.  These  results  are  consistent  with  those  observed  by  Kim  et  al.
(1987).

A  second  aviation-relevant  study  by  Way  (1988)  examined  a  pilot's
ability  to  discriminate  the  octants  of  a  wire  frame  "sphere"  surrounding  an
aircraft.  This  sphere  was  meant  to  simulate  the  functional  status  of  on-board
sensors.  Here  also,  as  with  the  Zenyuh  et  al.  study,  nonstereo  depth  cues  for
the  "front"  and  "back"  sides  of  the  sphere  were  impoverished,  provided  only  by
height  in  the  visual  field  (i.e.,  the  sphere  was  viewed  as  if  from  slightly
above).  Under  these  circumstances,  stereo  viewing  led  to  a  substantial
reduction  in  the  number  of  confusions  between  the  near  and  far  surfaces  of  the
sphere.

Way  (1988)  also  reports  a  study  in  which  stereo  was  used  to  highlight 
critical  regions  of  an  aircraft  system  diagram  by  making  these  "pop  out"  above 
the  image  plane.  He  found  that  color  coding  was  more  effective  than  stereo 
coding  in  reducing  the  time  required  to  identify  the  highlighted  item.

5.1.5  Ground  texture.  A  major  concern  in  the  development  of  flight
path  displays  has  been  determining  the  appropriate  3D  cues  for  ground
simulation  to  provide  information  for  perception  of  forward  velocity,  altitude,
and  heading  during  approach  or  low  level  flight.  There  is  no  doubt  that  such
information,  in  the  form  of  textural  gradient  and  expansion,  is  useful  (Gibson,
1979;  Langewiesche,  1944).  Recent  research  has  focused  on  determining  which
particular  elements  of  ground  texture  are  most  useful,  given  the  tradeoffs  in
computer  display  technology  necessary  to  implement  some  of  these  various  forms
of  texture.  These  data  have  a  direct  relationship  to  the  research  of  Cutting
and  Millard  (1984)  discussed  in  Section  3.5,  regarding  the  necessary  depth
cues  for  slant  perception,  although  that  research  is  not  generally  cited.

In  the  simulation  research  on  this  issue,  a  general  conclusion  seems  to
be  that  "more  is  better."  For  example,  Reardon  (1988)  compared  the  ability  of
the  four  textures  shown  in  Figure  5.3  to  support  accurate  judgments  of
landing  touchdown  point,  and  found  a  monotonic  improvement  in  performance  as
more  depth  cues  were  added,  from  left  to  right  in  the  figure.  Wolpert  (1988)
evaluated  the  two  components  of  the  grid  texture  shown  in  the  third  panel  of
Figure  5.3:  the  parallel  along-flight  path  lines,  which  provide  an  altitude
cue  of  linear  perspective,  and  the  perpendicular  cross-flight  path  lines,
which  provide  a  cue  of  spatial  frequency  or  texture  gradient.  The  particular
cues  of  altitude  provided  by  these  two  orthogonal  line  sets  are  sometimes
referred  to  as  splay,  along  the  flight  path,  and  compression,  across  the
flight  path.  Subjects  in  Wolpert's  simulation  attempted  to  maintain  altitude
with  each  of  the  two  patterns,  or  with  their  combination  in  the  grid.  Data
indicated  that  parallel  texture  and  its  derivative  cue  of  linear  perspective,
or  splay,  were  the  most  important  components  for  the  slant  perception
necessary  to  achieve  altitude  control.  The  advantages  of  parallel  texture
were  the  same  whether  it  was  presented  alone  or  combined  with  perpendicular
texture  so  that  a  grid  was  formed.  Compression  seemed  to  offer  little
advantage.

Figure  5.3.  Four  ground  textures  used  in  landing  judgment  study  by  Reardon
(1988).
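
The  geometry  behind  the  splay  and  compression  cues  described  above  can  be
sketched  as  follows.  This  is  a  minimal  illustration  that  assumes  flat
ground,  level  flight,  and  a  straight-ahead  view;  the  function  names  and  the
numerical  values  are  hypothetical  and  are  not  taken  from  the  studies  cited.

```python
import math


def splay_angle_deg(lateral_offset, altitude):
    """Image angle (from vertical) of a ground line parallel to the flight path.

    A line at lateral offset y, seen from altitude h, projects to an image line
    through the vanishing point at angle atan(y / h) from the vertical.
    """
    return math.degrees(math.atan2(lateral_offset, altitude))


def depression_angle_deg(distance_ahead, altitude):
    """Angle below the horizon of a ground line perpendicular to the flight path.

    A cross line at distance x ahead, seen from altitude h, lies atan(h / x)
    below the horizon.
    """
    return math.degrees(math.atan2(altitude, distance_ahead))


# Splay depends on altitude but not on how far the observer has travelled.
splay_high = splay_angle_deg(lateral_offset=10.0, altitude=10.0)
splay_low = splay_angle_deg(lateral_offset=10.0, altitude=5.0)

# The depression angle of a given cross line changes with forward travel
# even when altitude is held constant.
dep_before = depression_angle_deg(distance_ahead=100.0, altitude=10.0)
dep_after = depression_angle_deg(distance_ahead=50.0, altitude=10.0)

print(f"splay: {splay_high:.1f} deg at h=10, {splay_low:.1f} deg at h=5")
print(f"depression: {dep_before:.1f} deg at x=100, {dep_after:.1f} deg at x=50")
```

Under  these  assumptions,  a  loss  of  altitude  widens  the  splay  angle,  whereas
the  position  of  any  one  cross  line  in  the  image  varies  with  both  altitude
and  distance  travelled.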

A  somewhat  different  conclusion  regarding  the  usefulness  of  splay  was
obtained  by  Johnson,  Tsang,  Bennett,  &  Phatak  (1987)  and  O'Donnell,  Johnson,
and  Bennett  (1988).  Using  a  forward-looking  simulation  which  was  disturbed
along  translational,  but  not  rotational,  axes,  these  investigators  obtained
results  indicating  that  compression,  but  not  splay,  supported  altitude  control.

It  appears  that  the  results  supporting  splay  and  those  supporting
compression  are  not  altogether  in  conflict.  The  usefulness  of  one  versus  the
other  source  of  information  for  control  depends  upon  the  particular
constraints  placed  upon  the  vehicle  (e.g.,  hover  versus  forward  motion).  In
Wolpert's  study,  when  forward  travel  was  required,  altitude  information
conveyed  by  compression  was  degraded  by  the  flow  of  the  perpendicular  texture
across  the  visual  field,  whereas  the  appearance  of  splay  was  unaffected  by
forward  motion.  Hence,  splay  becomes  a  perceptually  more  reliable  cue.  In
stable  hover,  however,  when  forward  motion  is  not  involved  (Johnson  et  al.,
1987),  compression  is  only  affected  by  altitude,  and  it  becomes  as  effective
as  splay,  if  not  more  so  (Flach,  personal  communication).

A  recent  study  by  Weinstein  (1990)  compared  a  perspective  splay  display
with  a  more  conventional  forward-looking  ADI  for  control  of  altitude  and
heading  in  a  helicopter  simulation.  While  perspective  splay  did  not  provide
better  performance,  Weinstein's  data  suggest  that  control  effort  was  reduced
and  more  resources  were  freed  to  deal  with  a  concurrent  task  when  using  this
display.

Kleiss,  Hubbard,  and  Curry  (1989)  examined  the  effects  of  object  detail
and  object  density  of  computer-generated  imagery  in  simulator  altitude
control.  Their  conclusion  is  that  such  control  was  better  supported  by  a
larger  number  of  crudely  drawn  objects  (i.e.,  polygons)  than  a  smaller  number
of  more  "realistic"  objects  (e.g.,  trees).  Hence,  if  computational  image
generation  power  is  a  limiting  bottleneck,  then  such  power  would  be  best
allocated  to  creating  a  dense  but  abstract  or  schematic  set  of  ground  objects.

Another  important  tradeoff  in  dynamic  ground  texture  simulation  is  with
display  update  or  frame  rates.  An  early  study  by  Wempe  and  Palmer  (1970)
examined  the  influence  of  varying  frame  rates  on  landing  approaches.  While
high  update  rates  provide  greater  simulated  realism,  Wempe  and  Palmer
concluded  that  increasing  update  rate  above  0.3  Hz  did  not  improve  landing
performance.  However,  it  should  be  noted  that  their  study  incorporated  only
random  dot  ground  texture,  and  did  not  include  the  more  complex  line  texture
shown  to  the  right  side  of  Figure  5.3.

In  summary,  the  previous  studies  suggest  that  certain  kinds  of  control
tasks  may  best  depend  on  certain  kinds  of  cues  (e.g.,  splay,  compression).
Analysis  of  the  optical  information  necessary  for  control  can  provide  guidance
on  the  optimal  cue  for  a  particular  control  task.  However,  across  different
tasks,  more  information  appears  to  be  better,  and  any  textural  information,
whether  in  the  form  of  lines,  grids,  or  spots,  should  be  regular,  not  random.

5.1.6  Summary  and  conclusion.  The  current  data  strongly  suggest  that
well  implemented  3D  technology,  capitalizing  on  the  variety  of  monocular  and
binocular  cues  available  to  human  perception,  can  be  used  to  enhance
performance  in  aviation-related  environments.  Even  so,  the  number  of
studies  that  have  systematically  paired  3D  displays  with  corresponding  2D
versions  containing  the  identical  information  is  small,  the  results  of  such
studies  are  ambivalent,  and  more  research  of  this  sort  certainly  is  needed.
This  is  particularly  true  in  light  of  the  fact  that  there  may  be  shortcomings
of  3D  displays  inherent  in  their  ambiguity  when  presenting  distance  and
position  information.  Thus  far,  studies  which  have  examined  3D  applications
for  flight  path  control  and  guidance  have  failed  to  incorporate  tasks  that
require  precise  check  reading  of  flight  parameters  (which  would  be  directly
available  from  traditional  displays),  although  subjects  in  Adams's  (1982)
experiment  were  explicit  in  noting  the  absence  of  such  information  from  the
perspective  display.
With  regard  to  the  specific  cues  necessary  to  achieve  a  good  3D
representation,  the  current  data  remain  generally  consistent  with  the  "more  is
better"  model  drawn  from  Section  3.  Stereoscopic  cues  appear  to  be  neither
more  nor  less  effective  in  their  influence  on  performance  than  other  cues,
particularly  when  the  display  is  dynamic.  The  data  seem  to  suggest  some
departure  from  the  linearity  of  an  additive  model  in  the  upper  range  of  cue
numbers.  That  is,  when  a  number  of  depth  cues  (whether  mono  or  stereo)  are
already  present,  performance  is  not  improved  by  the  addition  of  more  cues,
perhaps  reflecting  a  ceiling  effect.  This  departure  from  the  additive  model
seems  to  be  particularly  true  when  motion  is  present.  This  conclusion  does
not  diminish  the  potential  importance  of  stereoscopy  in  flight  deck  displays.
Its  presence  can  certainly  enhance  depth  perception  with  static  displays,  or
lessen  the  need  for  the  computer-intensive  imagery  necessary  to  create  other
cues  in  a  dynamic  mode.  Finally,  the  relatively  small  number  of  studies  upon
which  the  above  conclusions  are  drawn  does  not  begin  to  address  the  full
problem  space.  Clearly,  a  great  deal  more  research  is  needed  in  this  area  to
explore  the  costs  and  benefits  of  stereo  technology.
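
The  contrast  between  a  strictly  additive  model  and  one  with  a  ceiling  can
be  made  concrete  with  a  toy  calculation.  The  sketch  below  is  purely
illustrative:  the  cue  weights  and  the  ceiling  value  are  arbitrary  assumed
numbers,  not  estimates  from  any  of  the  studies  reviewed  here,  and  the  hard
ceiling  is  only  one  of  many  possible  forms  of  nonadditivity.

```python
def additive_salience(cue_weights):
    """Strictly additive model: every cue contributes its full weight."""
    return sum(cue_weights)


def ceiling_salience(cue_weights, ceiling=1.0):
    """Additive model with a ceiling: contributions saturate at `ceiling`."""
    return min(ceiling, sum(cue_weights))


def marginal_gain(model, cues_present, added_cue):
    """Benefit of adding one more cue to a display under a given model."""
    return model(cues_present + [added_cue]) - model(cues_present)


STEREO = 0.3                        # assumed weight of the stereo cue
SPARSE = [0.3]                      # display with one other cue (assumed)
RICH = [0.3, 0.3, 0.3, 0.3]         # display with four other cues (assumed)

gain_sparse_additive = marginal_gain(additive_salience, SPARSE, STEREO)
gain_rich_additive = marginal_gain(additive_salience, RICH, STEREO)
gain_sparse_ceiling = marginal_gain(ceiling_salience, SPARSE, STEREO)
gain_rich_ceiling = marginal_gain(ceiling_salience, RICH, STEREO)

print("additive model:", gain_sparse_additive, gain_rich_additive)
print("ceiling model: ", gain_sparse_ceiling, gain_rich_ceiling)
```

In  this  toy  model,  stereo  helps  the  sparse  display  under  either  rule,  but
under  the  ceiling  rule  it  adds  nothing  to  the  display  that  is  already  rich
in  cues.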

5.2  Air  Traffic  Control  Displays 

The  air  traffic  control  environment  is  a  natural  one  for  the
introduction  of  3D  display  technology,  because  of  the  need  for  controllers  to
solve  problems  involving  all  three  dimensions  of  Euclidean  space  by
recommending  horizontal  and  vertical  maneuvers.  Hence,  it  is  surprising  that
relatively  little  work  has  been  done  in  this  area.  Most  prominent  is  a
program  of  research  carried  out  at  NASA  Ames  Research  Center,  which  has
focused  on  the  Cockpit  Display  of  Traffic  Information  or  CDTI  (McGreevy  &
Ellis,  1986;  Ellis,  McGreevy  &  Hitchcock,  1984,  1987;  Smith,  Ellis,  &  Lee,
1984).  Using  a  systematic  approach  to  optimizing  perspective  display  design,
Ellis  and  his  colleagues  developed  the  display  shown  in  Figure  5.4(b),  which
presents  an  "outside-in"  view  of  the  pilot's  own  ship,  along  with  the  position
of  other  nearby  aircraft.  Noteworthy  here  is  the  presence  of  a  number  of
artificial  aids  or  symbolic  enhancements  to  compensate  for  the  shortcomings  of
perspective  displays  in  precise  checkreading.  These  enhancements,  discussed
in  Section  4.1,  include  the  "post"  upon  which  each  aircraft  stands,  the  second
post,  in  front  of  each,  which  unambiguously  specifies  the  course  heading  along
the  ground,  and  the  "x"  on  each  post,  which  corresponds  to  the  altitude  of  the
pilot's  own  aircraft.  This  feature  allows  precise  determination  of  whether  an
aircraft  is  above  or  below  the  pilot's  aircraft.  Smith,  Ellis,  and  Lee  (1984)
carried  out  a  comparative  evaluation  of  this  display  format,  contrasted  with
the  2D  plan  view  shown  in  Figure  5.4(a).  In  their  simulation  experiment,  they
found  that  the  3D  version  was  viewed  more  favorably  by  pilots.  Ellis,
McGreevy  and  Hitchcock  (1984)  found  that  the  perspective  display  facilitated  a
greater  use  of  vertical  corrections  in  response  to  potential  collisions.

Figure  5.4.  (a)  Plan  view  display;  (b)  Perspective  display  for  air  traffic,
developed  by  Ellis,  McGreevy,  and  Hitchcock  (1984).

A  similar  prototype  was  developed  by  Bemis,  Leeds,  and  Winer  (1988)  to
support  the  performance  of  Navy  air  intercept  controllers,  who  must  detect  an
airborne  threat,  and  then  select  the  closest  friendly  aircraft  to  send  to
intercept  that  threat.  Figure  5.5(a)  and  (b)  contrast  the  2D  and  3D  versions
of  the  display  that  were  compared.  Their  data  revealed  a  substantial
advantage  for  the  perspective  display,  particularly  in  reducing  the  errors
made  in  picking  the  closest  interceptor.

In  light  of  the  success  of  both  of  these  research  programs,  it  is
surprising  that  more  work  has  not  been  carried  out  in  this  area,  either
examining  conventional  ATC  displays  or  examining  the  feasibility  of  employing
stereoscopic  displays  for  this  purpose.  The  experiment  by  Zenyuh  et  al.
(1988)  described  in  Section  5.1.4  was  not  targeted  directly  at  air  traffic
control,  but  did  suggest  the  usefulness  of  stereoscopic  displays  in
identifying  the  relative  position  of  objects  in  a  3D  volume.  Williams  and
Garcia  (1989a,  b)  have  proposed  that  their  mirror-driven  volume-visualization
display  could  serve  as  an  ideal  medium  for  ATC  displays,  as  the  controller
could  literally  look  over  and  walk  around  a  dynamic  volume  of  the  airspace
(see  Section  4.1).  However,  this  concept  remains  a  long  way  away  from
implementation.

5.3  Meteorology 

Meteorological  research  and  observation  generate  large  quantities  of
multidimensional  data.  For  each  (x,  y,  z)  spatial  coordinate  in  the
atmosphere,  a  minimum  of  five  principal  atmospheric  variables  are  generally  of
interest  over  time:  temperature,  pressure,  moisture,  wind  direction,  and  wind
speed  (Grotjahn,  1982).  Frequently,  the  interactions  among  these  quantities
or  additional  variables  must  also  be  considered.  It  is  not  surprising  then
that  satellite  and  radar  sensor  systems,  and  sophisticated  computer
technology  capable  of  generating  simulations  of  complex  atmospheric
phenomena,  have  produced  voluminous  and  complex  geographical  data  sets.
Researchers  in  the  atmospheric  sciences  and  those  involved  in  operational
weather  forecasting  need  to  assimilate  these  data  in  a  timely  manner.
Traditional  two-dimensional  planar  displays  are  inadequate  for  this  purpose.
The  need  to  preview,  analyze,  and  present  such  large  amounts  of  information
has  prompted  the  use  of  3D  static  and  animated  displays.  Figure  5.6  provides
prototypical  examples  of  two  such  displays.

The  depth  dimension  is  typically  expressed  in  any  one  or  combination  of 
the  following  formats. 

(1)  Colors  are  arbitrarily  assigned  to  different  levels  of  data  (false
color),  or  gray  scale  images  are  used,

(2)  Novel  vector  plots  are  used  in  which  arrow  length,  deflection  and
arrowhead  size  each  reflect  different  data  levels,

(3)  Contours  are  employed  to  display  variation  in  scalar  values,

(4)  Sequences  of  two-dimensional  contour  (or  sometimes,  perspective)
graphs  are  stacked  along  the  axis  perpendicular  to  the  plane  of  the
graphs,

(5)  Perspective  views  of  a  3D  object  or  of  the  trajectories  (paths)  of
the  data  points  in  3D  space  are  used,

(6)  True  or  artificial  stereography  is  used.  True  stereography  uses
satellite  data  which  are  actually  recorded  in  stereo.  Artificial
stereo  constructs  pairs  of  stereoscopic  pictures  in  part  from
computer-generated  images.  The  third  dimension  (height)  represents
any  scalar  quantity.

Figure  5.6.  Two  examples  of  perspective  meteorological  displays.  From
Moninger  &  Nelson,  1980.  First:  perspective  view  of  updrafts  and
downdrafts,  2  March  1978,  with  contour  surfaces  enclosing  the  regions  of
updrafts  >4  m/s  and  downdrafts  >4  m/s;  vertical  scale  exaggerated  by  a
factor  of  4  (from  Moninger,  1980a).  Second:  three-dimensional  perspective
display  of  storm  reflectivity,  updrafts,  hypothetical  embryo  trajectories,
and  computed  hailstone  trajectory,  with  the  region  of  maximum  hail  growth
marked  (after  Nelson,  1980).

Applications  using  the  latter  two  formats  (which  employ  display 
representations  of  depth)  will  be  discussed  below.  Note  that  evaluations  of 
these  display  techniques  in  the  meteorological  field  have  been  purely  informal 
and  subjective. 

5.3.1  Perspective.  Hasler,  Pierce,  Morris  and  Dodge  (1985)  applied  a
perspective  technique  to  display  several  kinds  of  meteorological  image  data.
The  procedure  took  a  visible  and  infrared  image  pair  obtained  by  satellite
(an  artificial  illumination  image  may  be  substituted  for  the  visible  image)
and  combined  the  pair  into  a  single  perspective  image.  The  infrared  image
provided  information  necessary  to  construct  the  3D  surface  of  the  final
image  (a  cloud)  while  the  visible  image  supplied  the  coloration  or  shading
information  necessary  for  texturing.  This  shaded  perspective  display  was
found  to  be  superior  to  a  wire-frame  perspective  display,  on  the  basis  of
subjective  judgments  of  realism.

By  systematically  selecting  and  combining  a  series  of  "eye  points,"
defining  the  viewing  azimuth  and  elevation  angle,  and  "view  points"  (the
location  of  the  point  being  viewed),  and  constructing  the  image  as  it  would
be  seen  looking  from  the  eye  point  towards  the  view  point,  a  movie  was
produced  which  employed  motion  parallax  as  an  additional  depth  cue.  An
impression  of  "flying  over"  or  "moving  through"  the  display  was  reported  by
viewers.  Additional  depth  cues  were  given  by  using  shadows  and  perspective
when  incorporating  annotation  representing  latitude  and  longitude  into  the
image.  The  annotation  appeared  to  float  over  the  clouds.
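
The  eye  point  and  view  point  construction  described  above  corresponds  to
what  is  now  commonly  called  a  "look-at"  camera.  The  sketch  below  is  a
generic  illustration  of  that  idea,  not  the  procedure  used  by  Hasler  et  al.;
the  coordinate  convention  (z  up),  the  function  names,  and  all  numerical
values  are  assumptions.

```python
import math


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _unit(a):
    norm = math.sqrt(_dot(a, a))
    return [c / norm for c in a]


def eye_point(view_point, azimuth_deg, elevation_deg, distance):
    """Eye position at a given azimuth, elevation, and distance from the view point."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return [view_point[0] + distance * math.cos(el) * math.cos(az),
            view_point[1] + distance * math.cos(el) * math.sin(az),
            view_point[2] + distance * math.sin(el)]


def project(point, eye, view_point, focal_length=1.0):
    """Perspective image coordinates of `point` seen from `eye` toward `view_point`.

    Returns None for points behind the eye. Not valid when looking straight
    down or up, because the world z axis is used as the up reference.
    """
    forward = _unit(_sub(view_point, eye))
    right = _unit(_cross(forward, [0.0, 0.0, 1.0]))
    up = _cross(right, forward)
    rel = _sub(point, eye)
    depth = _dot(rel, forward)
    if depth <= 0:
        return None
    return (focal_length * _dot(rel, right) / depth,
            focal_length * _dot(rel, up) / depth)


# Hypothetical "fly-around": one frame every 10 degrees of azimuth.
VIEW_POINT = [0.0, 0.0, 0.0]
CLOUD_TOP = [0.0, 0.0, 1.0]   # assumed data point 1 unit above the view point
frames = [project(CLOUD_TOP, eye_point(VIEW_POINT, az, 30.0, 10.0), VIEW_POINT)
          for az in range(0, 360, 10)]
```

Rendering  every  data  point  for  each  successive  eye  point  yields  the  frame
sequence  whose  changing  image  positions  supply  motion  parallax.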

Kelley,  Russo,  Eyton,  and  Carlson  (1988)  also  developed  a  perspective
technique,  Model  Output  Enhancement  (MOE),  which  generates  mesoscale  weather
forecasts.  The  resulting  display  uses  both  color-class  planar  maps  and
color-class  maps  overlaid  on  perspective  plots  of  terrain.  It  was  felt  that
the  use  of  3D  perspective  provided  the  viewer  with  a  greater  "spatial
feeling"  for  the  mesoscale  temperature  field  and  aided  in  the  interpretation
of  the  data.

At  least  two  problems  have  been  noted  when  perspective  alone  is  used  as
a  depth  cue  (Grotjahn  &  Chervin,  1984).  There  can  be  a  "line  of  sight"
problem,  that  is,  ambiguity  as  to  where  in  a  2D  viewing  plane  a  data  point
actually  lies.  This  problem  can  be  attenuated  by  (1)  using  a  shadow  cast
(projection)  on  the  sides  of  a  volume  box  enclosing  the  data  points  (although
this  adds  to  display  clutter),  (2)  varying  the  size  of  the  symbol  representing
the  data  point  depending  on  its  "distance"  from  the  viewer,  (3)  employing
stereoscopy,  or  (4)  allowing  rotation  of  the  object,  thus  enhancing  the
viewer's  ability  to  construct  a  3D  mental  image  (Grotjahn,  1982).

Grotjahn  also  observed  that  viewer  disorientation  is  possible  when 
viewing  a  perspective  object  from  different  angles  unless  the  different  angles 
are  shown  simultaneously  (which  again  increases  display  clutter). 

5.3.2  Stereo  enhancement.  Hasler,  Desjardins,  and  Negri  (1981)
describe  a  general  technique  for  displaying  multidimensional  data  sets  using
artificial  stereo.  As  discussed  in  Section  4.2,  one  kind  of  artificial
stereogram  consists  of  an  original  image  and  a  second,  computer-generated
image  produced  by  shifting  the  pixels  in  the  original  image  to  the  right.
The  computer-generated  image  is  colored  blue-green  and  superimposed
over  the  original  image,  which  is  colored  red.  The  user  views  the  image
display  with  red/green  anaglyphic  glasses  which  direct  each  image  of  the  pair
to  the  right  and  left  eyes,  respectively,  creating  a  stereo  image.  Hasler  et
al.  report  that  stereo  presentation  allows  the  viewer  to  more  easily  and
intuitively  assimilate  the  information  contained  in  the  image  (although  this
was  not  empirically  evaluated).
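
The  pixel-shifting  scheme  described  above  can  be  sketched  in  a  few  lines.
This  is  a  minimal  illustration  under  stated  assumptions,  not  the
implementation  of  Hasler  et  al.:  images  are  small  gray  scale  grids  held  as
nested  lists,  the  shift  is  taken  to  be  proportional  to  a  height  value
between  0  and  1,  and  the  names  `shift_right`  and  `anaglyph`  are  hypothetical.

```python
def shift_right(image, height, max_shift=3, fill=0):
    """Build the second image by moving each pixel right in proportion to its height.

    `image` and `height` are equally sized lists of rows; heights lie in 0..1.
    When two pixels land on the same target, the higher one is kept.
    Targets that receive no pixel keep the `fill` value (holes are not repaired).
    """
    rows, cols = len(image), len(image[0])
    shifted = [[fill] * cols for _ in range(rows)]
    top = [[-1.0] * cols for _ in range(rows)]  # height of the pixel stored so far
    for r in range(rows):
        for c in range(cols):
            target = c + int(height[r][c] * max_shift + 0.5)
            if target < cols and height[r][c] > top[r][target]:
                shifted[r][target] = image[r][c]
                top[r][target] = height[r][c]
    return shifted


def anaglyph(image, height, max_shift=3):
    """Combine the pair as (red, green, blue): original in red, shifted copy in blue-green."""
    shifted = shift_right(image, height, max_shift)
    return [[(image[r][c], shifted[r][c], shifted[r][c])
             for c in range(len(image[0]))]
            for r in range(len(image))]


# One-row example: only the third pixel is "high", so only it is displaced.
row_image = [[10, 20, 30, 40]]
row_height = [[0.0, 0.0, 1.0, 0.0]]
composite = anaglyph(row_image, row_height, max_shift=1)
```

A  practical  version  would  also  need  to  fill  the  holes  left  behind  by
displaced  pixels;  that  step  is  omitted  here  for  brevity.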

Papathomas  et  al.  (1987)  demonstrated  that  anaglyphic  stereo  animation
can  portray  large  four-dimensional  (space  and  time)  data  sets.  They
transformed  the  numerical  data  output  of  a  simulation  program  which  models
time  evolution  of  weather  episodes  into  stereo  animation  sequences  of  storm
clouds.  In  rendering  the  graphics  of  the  clouds,  Papathomas  et  al.  found  that
a  "particle  system"  format,  in  which  all  points  in  the  volume  of  a  cloud  are
illuminated,  was  superior  to  a  wireframe  or  surface  rendering  of  the  object  (a
cloud).  The  particle  system  is  suited  to  the  depiction  of  fuzzy  objects.
Superiority  in  large  part  was  judged  on  the  basis  of  subjective  scene  realism.
Papathomas  et  al.  also  found  that  graphic  realism  could  be  further  enhanced  by
randomly  "jittering"  the  local  random-dot  distribution  relative  to  the
background,  that  is,  movement  of  cloud  particles  aided  in  depth  perception.

Hasler  (1981)  describes  the  techniques  and  applications  of  stereographic
images  derived  from  actual  stereo  observations  (from  geosynchronous
satellites).  Unlike  artificial  stereo  techniques,  a  major  advantage  of  true
stereographic  images  is  that  they  can  be  used  to  make  high-precision  cloud
height  measurements.  Determination  of  height  involves  parallax  measurement  of
the  image.  The  location  of  a  feature  on  each  of  the  images  of  a  stereo  pair
is  measured  by  using  one  of  three  techniques,  all  of  which  involve  shifting
one  image  until  parallax  for  a  feature  is  eliminated.  Height  calculations  can
be  derived  from  the  amount  of  this  parallax  shift.
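
The  step  from  parallax  shift  to  height  can  be  illustrated  with  a  simplified
geometry.  The  sketch  below  assumes  a  flat  earth  and  two  satellites  viewing
the  cloud  from  opposite  sides  in  the  same  vertical  plane,  so  that  the
apparent  ground  displacement  p  of  a  feature  at  height  h  is
p  =  h  (tan  z1  +  tan  z2),  where  z1  and  z2  are  the  viewing  zenith  angles.
This  is  a  textbook  approximation,  not  the  procedure  reported  by  Hasler
(1981),  and  the  function  name  and  numbers  are  hypothetical.

```python
import math


def cloud_height_km(parallax_km, zenith1_deg, zenith2_deg):
    """Height from parallax under a flat-earth, opposite-side viewing geometry.

    Solves p = h * (tan z1 + tan z2) for h.
    """
    spread = (math.tan(math.radians(zenith1_deg))
              + math.tan(math.radians(zenith2_deg)))
    if spread <= 0:
        raise ValueError("viewing geometry gives no usable parallax")
    return parallax_km / spread


# Assumed example: 20 km of parallax shift, both satellites viewing at 45 degrees.
example_height = cloud_height_km(20.0, 45.0, 45.0)
print(f"estimated cloud height: {example_height:.2f} km")
```

The  relation  also  shows  why  viewing  geometry  matters:  the  smaller  the
combined  tangent  term,  the  less  parallax  a  given  height  produces  and  the
more  a  fixed  measurement  error  in  the  shift  inflates  the  height  error.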

While  true  stereography  is  more  useful  quantitatively,  artificial  stereo
viewing  provides  higher  visual  quality  and  a  stronger  stereoscopic  effect.

In  summary,  it  appears  that  the  development  and  evaluation  of  3D
meteorological  displays  are  at  a  far  earlier  stage  than  the  corresponding
development  for  flight  displays,  despite  the  potential  benefits  that  this
technology  has  to  offer.

5.4  Teleoperator  and  Robotics

Teleoperation  involves  the  remote  control  of  machines.  Typically,  a
human  operator  contributes  his/her  perceptual-motor  and  cognitive  skills  to  a
manipulation  task  via  a  machine  interface  located  at  some  distance  from  a
hazardous  or  inaccessible  environment.  Visual  or  graphic  displays  are
necessary  to  provide  the  human  operator  with  unambiguous  information  about  the
work  environment.  Much  of  this  information  is  three-dimensional,  and  it  goes
without  saying  that  some  form  of  3D  representation  is  valuable.  The
identification  of  the  specific  display  cues  affecting  spatial  perception,
then,  is  essential  to  facilitate  performance.  Many  of  the  studies  conducted
in  the  context  of  teleoperation  have  focused  on  the  relative  utility  of
monoscopic  and  stereoscopic  displays  and  on  visual  enhancements.  These  are
described  below.

5.4.1  Monoscopic v. stereoscopic displays. An early study by Pepper and
Cole (1978) suggested that stereoscopic viewing systems did not significantly
contribute to successful remote undersea manipulation. This was surprising
given the many direct-viewing studies in which the superiority of binocular
over monocular performance had been demonstrated. Pepper et al. (1981)
explored the possibility that manipulation performance under mono and stereo
viewing is a function of the interplay of a number of factors: environmental
visibility conditions, task type, and operator experience. Three experiments
were conducted to assess the relative performance of binocular disparity cues
over monocular cues across different task types, experience levels, and
visibility conditions.

A laboratory peg-task was used in the first experiment. This task was
comparable to such real-world tasks as drilling, tapping, threading, and
connecting, and primarily involved alignment in the x and y (horizontal and
vertical, respectively) planes and rotational movement. In the peg-task,
monocular cues were critical for object recognition and spatial location while
stereo cues were considered less relevant. Highly practiced subjects were
used. Performance times were examined under three levels of visibility
(clear, moderate, and severe). Even though the experimental design was biased
against the stereo and severe visibility conditions, performance time was
found to be facilitated by the stereo display.

In  the  second  experiment  the  peg-task  and  three  levels  of  visibility 
were  used  again;  however,  this  time  subjects  were  used  who  had  some  experience 
with  teleoperation  but  were  unfamiliar  (unpracticed)  with  the  specific  task  at 
hand.  Nonvisual  factors  were  minimized  while  monocular  depth  cues  were 
maximized.  The  mono-stereo  differences  were  not  found  to  be  significant 
statistically,  although  this  was  not  surprising  since  the  experimental  design 
emphasized  the  use  of  monocular  depth  cues. 

A messenger-line feeding task (comparable to such tasks as line
attachment,  sample  gathering,  and  simple  salvage  tasks)  was  employed  in  the 
third  experiment.  This  type  of  task  requires  alignment  in  the  x,  y,  and  z 
planes  in  a  visual  scene  containing  many  conflicting  depth  cues.  Under  all 
levels  of  visibility,  stereo  performance  was  found  to  be  far  superior  in  this 
type  of  remote  manipulation  task  due  to  the  reduced  level  of  available 
monocular  cues.  Further,  stereo  TV  was  found  to  be  degraded  less  by  poor 
visibility  than  was  mono  TV. 

This series of experiments demonstrated that the level of improvement
due to stereo TV is dependent upon the complex interaction of visibility,
task, and learning factors. It was shown that stereo TV displays are superior
to mono under most of the conditions tested. As scene complexity and object
ambiguity increased, the advantage of the stereo display became more
pronounced.

Kim, Takeda, and Stark (1988) quantitatively assessed the utility of
superimposing visual enhancements onto a flat video screen to assist operators
in performing telemanipulation tasks. A teleoperator simulator with a
two-degrees-of-freedom joystick and five-degrees-of-freedom manipulator was
used. In the first of two experiments, subjects performed pick-and-place
tasks under three visual conditions: direct view, monoscopic TV view, and
monoscopic TV view with visual enhancements. The objects to be picked up and
the boxes in which they were placed were restricted to the plane of the robot
base. Visual enhancements included a vertical reference line (indicating the
vertical height of a point from the base plane; see also Section 4.1), a
reference grid, and a stick figure model of the manipulator gripper and its
projection. In the second experiment, pick-and-place tasks were performed
under five visual conditions: direct view, two adjacent views, visually
enhanced TV view, two perpendicular TV views, and a stereoscopic TV view (a
helmet-mounted display was used). Objects to be picked up were arbitrarily
positioned. Practiced subjects were used in both experiments. Time to task
completion and task success rate were used as performance measures.

Direct view provided the best performance overall. Superimposing the
enhancements was found to greatly assist human performance of telemanipulation
tasks, relative to the mono TV display alone, when the objects were all
located on the robot base plane. Visual enhancements also benefited
performance relative to the mono TV when the objects were randomly positioned;
however, the accrued advantage was not as pronounced or reliable. It was
suggested that this was because the visual enhancements did not "explicitly
indicate object orientation." The results further showed that when subjects
were provided with only the monoscopic view, occlusion was the potent cue used
to determine the location of the manipulator gripper relative to the target
object.

The overall conclusion in this set of experiments was that on-the-screen
visual  enhancements  greatly  facilitate  task  performance  when  objects  are 
positioned  along  the  robot  plane.  When  objects  are  randomly  positioned, 
additional  research  is  needed  to  determine  how  more  reliable  performance  may 
be  achieved. 

Kim et al. (1987) evaluated monoscopic and stereoscopic graphic displays
with and without grid or reference lines by employing a three-axis manual
tracking task. Three perspective display parameters were varied: elevation
angle, azimuth angle, and object distance. Root-mean-square (RMS) tracking
error was used as the quantitative performance measure.
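
For readers unfamiliar with the measure, the following is a minimal sketch of
one way an RMS tracking error could be computed for a three-axis task. The
source does not state how error was combined across axes; the sketch assumes
the Euclidean distance between target and cursor at each sample, and the
function name and data layout are hypothetical.

```python
import math

def rms_tracking_error(target, cursor):
    """RMS of the 3D distance between target and cursor samples.

    target, cursor: equal-length sequences of (x, y, z) tuples.
    Assumption: error per sample is the Euclidean distance.
    """
    if len(target) != len(cursor) or len(target) == 0:
        raise ValueError("need two non-empty, equal-length sample lists")
    squared_sum = 0.0
    for (tx, ty, tz), (cx, cy, cz) in zip(target, cursor):
        squared_sum += (tx - cx) ** 2 + (ty - cy) ** 2 + (tz - cz) ** 2
    return math.sqrt(squared_sum / len(target))

# Example: one sample is 5 units off target, the other is exact.
error = rms_tracking_error([(0, 0, 0), (0, 0, 0)], [(3, 4, 0), (0, 0, 0)])
```

Because errors are squared before averaging, brief large excursions from the
target weigh more heavily than a steady small offset.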

It was shown that the grid did not improve tracking performance. The
reference line served as an important depth cue and greatly facilitated
performance for mono-perspective displays. The stereoscopic display permitted
the lowest RMS tracking error over all visual conditions, even when the grid
or reference lines were not present. Further, at extreme viewing elevation
angles, the stereoscopic display provided good performance, suggesting that it
was robust against elevation angle.

Similar results were obtained in the study carried out by Kim, Tendick,
and  Stark  (1987)  discussed  in  Section  3.1.  Stereoscopic  viewing  benefited  a 
3D  tracking  task  unless  the  task  was  augmented  with  symbolic  enhancements  (a 
post)  and  a  perspective  grid. 

5.4.2  Stereo viewing implementation. A discussion of issues related to
positioning of stereo cameras for teleoperation was presented in Section
4.2.3. A general conclusion, which will be reiterated here, is that
hyperstereo displays, in which the convergence angle between two imaging
cameras is greater than the angle formed by the two eyeballs, provide an
advantage for many teleoperator manipulation tasks (Diner & von Sydow, 1988).
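
The hyperstereo condition can be made concrete with a small geometric sketch.
It assumes that both cameras (or both eyes) converge symmetrically on a point
straight ahead, so that the convergence angle is 2 * atan(separation / (2 *
distance)). The default interocular separation of 0.063 m is an assumed
typical value, not a figure from the source, and the function names are
hypothetical.

```python
import math

def convergence_angle_deg(separation_m, distance_m):
    """Angle (degrees) between two lines of sight converging on a point.

    Assumption: symmetric convergence on a point straight ahead.
    """
    return math.degrees(2.0 * math.atan(separation_m / (2.0 * distance_m)))

def is_hyperstereo(camera_separation_m, distance_m, eye_separation_m=0.063):
    """True if the camera pair converges more sharply than eyes would
    at the same distance (eye separation is an assumed typical value)."""
    camera_angle = convergence_angle_deg(camera_separation_m, distance_m)
    eye_angle = convergence_angle_deg(eye_separation_m, distance_m)
    return camera_angle > eye_angle

# Example: cameras 20 cm apart viewing a work site 1 m away.
wide_pair = is_hyperstereo(0.20, 1.0)
```

At a fixed distance the comparison reduces to whether the camera separation
exceeds the eye separation; the angle form is shown because the text defines
hyperstereo in terms of convergence angle.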

5.4.3  Summary and conclusions. For some simple teleoperation tasks,
monocular depth cues, used in conjunction with the operator's cognitive depth
cues (e.g., derived from experience), may be sufficient for successful
performance. However, with increased task complexity, these cues may be
unavailable. In some situations, providing visual enhancements, such as
reference lines, and appropriate perspective parameters may be sufficient to
compensate for the inadequacies of a monoscopic display. Stereoscopic
displays, however, provide binocular depth cues which significantly enhance
performance, particularly when visual enhancement cues are not presented.

5.5  Other 3-Dimensional Graphics

A number of other potential applications of 3D displays have not been
described above. These roughly fall into two categories: static graphs that
allow operators or scientists to inspect data, and dynamic or interactive
graphics that allow users to manipulate and explore visual information
represented in three dimensions. While many examples of 3D graphics of both
kinds exist, very few have been subjected to empirical evaluation and
experimental manipulation.

5.5.1  Static graphs. In spite of the increasing popularity of 3D
graphs to represent data with three or more variables, of the sort shown in
Figure 5.7, there seems to be an almost total absence of empirical data
bearing on their effectiveness. For example, Tufte's (1983) classic treatise
on the graphic display of data contains only one example of such a graph (the
one shown in Figure 5.7), out of over 200 figures. This absence is unfortunate
because graphs of this sort have applicability not only for data
representation, but also potentially in airborne applications, to represent
operating "envelopes" or performance parameter limits that depend upon three
or more variables (e.g., airspeed, altitude, and turn radius). One exception
is an experiment by Jensen and Anderson (1987), who compared correlational
scatter plots rendered by dots (Figure 5.8(b)) or by 3D "mountains," whose
height at any one point is proportional to the density of the points (Figure
5.8(a)). Subjects' judgments of the degree of relationship between the x and
y variables were actually more accurate with the scatter plots than with the
3D rendering. A second piece of empirical data is provided by an experiment by
Liu (1989), who compared scientists' ability to understand the clustering of
data, in a table of associations, when strength of association was coded
either by color or by the height of a third dimension above the N x N matrix.
Liu found that 3D coding produced slightly faster but considerably less
accurate performance. Taken together, the two studies provide no consistent
evidence for an advantage of 3D renderings of data relations.

Figure 5.7. Example of 3-dimensional graphic display (horizontal axis: Party
Identification; categories: Democrat, Independent, Republican). Philip E.
Converse, "Religion and Politics: The 1960 Election," in Angus Campbell,
Philip E. Converse, Warren E. Miller, and Donald E. Stokes, Elections and the
Political Order (New York, 1966), 102-103. Reprinted from E.R. Tufte (1983).
The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press.
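
To illustrate what a "mountain" rendering encodes, the sketch below bins
scatter-plot points into a square grid and assigns each cell a height
proportional to the number of points it contains. This is only an assumed
reading of "height proportional to density"; it is not the rendering
procedure used by Jensen and Anderson, and the function name, grid size, and
data range are hypothetical.

```python
def density_surface(points, bins, lo, hi):
    """Return a bins x bins grid of heights proportional to point density.

    points: iterable of (x, y) pairs; lo, hi: shared data range of both axes.
    Heights are the fraction of in-range points falling in each cell
    (rows index y, columns index x).
    """
    width = (hi - lo) / bins
    counts = [[0] * bins for _ in range(bins)]
    for x, y in points:
        if not (lo <= x <= hi and lo <= y <= hi):
            continue  # ignore points outside the plotted range
        col = min(int((x - lo) / width), bins - 1)
        row = min(int((y - lo) / width), bins - 1)
        counts[row][col] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        return [[0.0] * bins for _ in range(bins)]
    return [[count / total for count in row] for row in counts]

# Example: four points on a 2 x 2 grid over the unit square.
surface = density_surface(
    [(0.1, 0.1), (0.2, 0.2), (0.9, 0.9), (0.8, 0.1)], bins=2, lo=0.0, hi=1.0
)
```

Note that the binning discards the position of individual points within a
cell, whereas the dot scatter plot retains it.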

The absence of more empirical research in this area allows us only to
outline a number of salient issues that should be considered as such graphs
are constructed. Among these are:

(1)  For graphs that depict functional relations of the form y = f(x,z)
(or experimental data in which y is a dependent variable affected by
x and z), how are such data assigned to axes to allow best
interpretation? Conventional rendering typically assigns x and z to
a horizontal surface and y to a vertical one, as shown in Figure
5.7. What assignment should be made if x is a single independent
variable, and y,z are bivariate dependent variables?

Figure 5.8. (a) 3D rendering, (b) 2D rendering of correlational data by
Jensen et al.

(2)  Using 3D graphics, should protrusions representing the separate data
points be connected to form a surface (as in Figures 5.7 and 5.8(a))
or remain separate? While experimental data are not available on
the point, it is apparent that the lines created by those
connections do add important visual features that reflect the
surface shape, and so are probably advisable.

(3)  Considering the same issues addressed by McGreevy, Ratzlaff, & Ellis
(1986) in Section 4.1, what azimuth and elevation viewing angles
will minimize perceptual distortions?

(4)  As we noted in Section 1, 3D displays can incur biases and ambiguity
regarding the precise estimation of distance at different depth
planes. It is clear, for example, using a parallel projection, such
as that shown in Figure 5.7, that bars of equal physical height
(i.e., measured in screen pixels) will not yield equal perceived
height. Rather, size-distance invariance computation will
exaggerate the perceived height of the more distant bar. At the
same time, use of perspective geometry which would compensate for
perceived distance would not take into account the "cues for
flatness" that would lead the observer to perceive less depth than
is really the case. The point here is that the amount of visual
distortion created (and therefore compensation needed) by
size-distance invariance is not well established. Where absolute height
measurements on the ordinate are required, these should be augmented
by tick marks on the bars.

(5)  There is reason to believe that 3D representations of data are not
of value unless they convey a pictorial relation between variables
that cannot easily be discerned by two 2D "slices." Typically any
additive effects or linear interactions can easily be understood by
two 2D graphs. Where interactions involve higher-order terms (see
for example Figure 5.7), the value of the 3D representation grows
accordingly.
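
The size-distance issue raised in item (4) can be sketched numerically. The
sketch assumes the simplest form of size-distance invariance (perceived size
proportional to image size times perceived distance) and adds a hypothetical
"depth gain" between 0 and 1 to stand in for the cues for flatness; as the
text notes, the actual amount of distortion is not well established, so the
parameter values below are illustrative only.

```python
def perceived_height(screen_height, viewing_distance, depicted_distance,
                     depth_gain=1.0):
    """Perceived bar height under a simple size-distance invariance model.

    depth_gain (hypothetical): 1.0 means the depicted depth is fully
    perceived; 0.0 means the display is seen as completely flat.
    """
    if not 0.0 <= depth_gain <= 1.0:
        raise ValueError("depth_gain must lie between 0 and 1")
    perceived_distance = viewing_distance + depth_gain * (
        depicted_distance - viewing_distance
    )
    return screen_height * perceived_distance / viewing_distance

# Two bars of equal screen height in a parallel projection: the bar
# depicted as farther away is perceived as taller.
near_bar = perceived_height(10.0, 50.0, 50.0)
far_bar = perceived_height(10.0, 50.0, 100.0)
far_bar_flattened = perceived_height(10.0, 50.0, 100.0, depth_gain=0.5)
```

In this model the exaggeration of the distant bar shrinks as the depth gain
falls, which is why a correction computed from the depicted geometry alone
could overcompensate.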

5.5.2  Dynamic interactive graphics. Two general categories of
applications are found in dynamic computer-generated displays. The first of
these actually involves a set of applications to real-time computer-based
graphics for such applications as computer-aided design (CAD), computer-aided
manufacturing (CAM), and interacting with data bases. For example, Barfield,
Sandford, and Foley (1988) have considered the advantage of different surface
representations in facilitating the 3D mental rotation of CAD images. They
found that shaded surfaces allowed faster manipulation than did wire-frame
surfaces. Chen, Mountford, and Sellen (1988) have explored different cursor
positioning devices for manipulating 3D images, and found that there is a
compatibility between the natural rotation of the image in three dimensions,
and a spherical cursor that has a similar 3D rotational capability (relative
to 2D and 1D manipulanda). Huggins and Getty (1984) examined issues related
to control-display compatibility when using the multiplanar "space graph"
display described in Section 4.3.2. While numerous other examples of 3D
control-display ensembles for CAD/CAM-like operations exist, these are
typically void of valid human performance data.

Beaton, DeHoff, Weiman, and Hildebrandt (1987) also evaluated
cursor-positioning devices for a 3D display workstation. Operator performance
on a cursor positioning task was compared using three types of input devices
and two modes of displaying depth information: a linear perspective display
and a time-multiplexed stereoscopic display. The three input devices used
were a trackball, a mouse, and a 3D thumbwheel. The trackball provided
unrestricted cursor movement along the three axes of the workspace (free-space
movement). The mouse allowed for movements in either the xy, xz, or yz display
planes (plane-oriented movement) while the thumbwheel provided for
vector-oriented movements, or movement through separate cursor control for
each axis. The vector-oriented device was found to provide the greatest
positioning accuracy. Rapid cursor movements in the z plane (depth) were found
to be more difficult than were movements in the more conventional xy plane.

Beaton & Weiman (1988) extended the above study evaluating
cursor-positioning to include two additional vector-oriented movement devices:
a set of planar thumbwheels and a slider device. Again, it was found that the
vector-oriented input devices provide more accurate and faster 3D cursor
positioning than either plane-oriented or free-space input devices. These
results were attributed to the observation that inadvertent motions of highly
coupled input devices (i.e., free-space and plane-oriented) produce
undesirable movements of the cursor. Unlike the situation with a 2D display
system, all motions of such input devices correspond to displacement in a 3D
workspace.
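
The distinction between the three movement types can be sketched as a mapping
from raw device motion to cursor displacement. This is an illustrative model
only, not the software used in the studies above; the function name, the
device labels, and the dictionary representation of the cursor are all
hypothetical.

```python
AXES = ("x", "y", "z")

def move_cursor(cursor, device, motion, selected=None):
    """Apply raw device motion to a 3D cursor and return the new position.

    device: "free"   - three deltas drive x, y, and z together;
            "plane"  - two deltas drive the selected plane, e.g. "xz";
            "vector" - one delta drives the single selected axis.
    """
    if device == "free":
        axes = AXES
    elif device == "plane":
        axes = tuple(selected)
    elif device == "vector":
        axes = (selected,)
    else:
        raise ValueError("unknown device type")
    if len(motion) != len(axes):
        raise ValueError("motion does not match the device's channels")
    moved = dict(cursor)
    for axis, delta in zip(axes, motion):
        moved[axis] += delta
    return moved

start = {"x": 0.0, "y": 0.0, "z": 0.0}
# Intended move: one unit in depth, with a small inadvertent hand motion
# of 0.2 on every other channel the device exposes.
free_space = move_cursor(start, "free", (0.2, 0.2, 1.0))
plane_oriented = move_cursor(start, "plane", (0.2, 1.0), selected="xz")
vector_oriented = move_cursor(start, "vector", (1.0,), selected="z")
```

In this model the inadvertent component displaces the cursor on two axes for
the free-space device, on one axis for the plane-oriented device, and on none
for the vector-oriented device, which is consistent with the coupling
explanation offered above.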

In addition, lower cursor positioning errors were consistently
associated with the stereoscopic display mode for all input devices except the
planar thumbwheel (in which errors across display mode were equally low).
However, it was noted that cursor-positioning time was similar across display
mode, suggesting that users were not aware of their cursor-positioning errors.

Scientific visualization, the second area of dynamic interactive
graphics, is an emerging field that also capitalizes on dynamic
computer-generated 3D representation (Alexander & Winarsky, 1989). Here the
interest is in creating a 3D rendering of a phenomenon of interest to the
scientists, which they may interactively "explore." Such phenomena may involve
the dynamics of storm systems, the behavior of the upper atmosphere in
reflecting or absorbing radiation, the structure of complex molecules, the
fracturing of metals under stress, or abstract mathematical relations. Figure
5.9 presents the static rendering of a severe thunderstorm which has assisted
one scientific team in understanding the airflow within such a system
(Wilhelmson et al., 1989).

Generally such renderings in scientific visualization are based upon
intuitions on the part of the graphic artist working with the scientist, and
are not yet based upon a set of empirically based principles. In an
interesting convergence of technologies, however, the objective of
visualization is to create a virtual world of the phenomenon of interest which
the scientists can explore and through which they may navigate. The objective
of the aviation display designer is, increasingly, to capture the spatial
world through which the pilot must navigate in a virtual display like that
shown in Figure 5.1. Thus, the objectives of these two technologies are
actually quite close, and it is likely that design lessons learned in each can
be exploited in the development of the other.

Figure 5.9. Static rendering of dynamic visualization program portraying the
evolution of a severe storm (from Wilhelmson et al., 1989). Courtesy of
National Center for Supercomputing Applications.


6.0  CONCLUSIONS 


There is little doubt that 3D renderings, if carefully constructed, can
provide a "natural" viewing of a variety of environments, which is
aesthetically pleasing. Furthermore, the literature reviewed in the preceding
pages provides many examples of domains in which such renderings have been
shown to be useful when compared with well constructed 2D renderings.
Examples of this usefulness will proliferate as computer graphics technology
continues to improve in speed and resolution, and as the fledgling industry of
stereo viewing devices begins to grow. Stereoscopic viewing, which is
sometimes considered to be "true" 3D, often provides an advantage over
non-stereo, although the advantage is not necessarily more pronounced than
that offered by other salient cues related to motion parallax and occlusion.
The review has also highlighted certain areas where neither 3D in general nor
stereo in particular has proven advantageous, particularly in dynamic flight
deck displays. An encouraging aspect of the research reviewed is the extent
to which general findings from basic research, related to cue dominance and
the interaction of static and dynamic cues, are substantiated in more applied
contexts.

One conclusion drawn from the review is the obvious one that more
empirical data are needed on the interaction between multiple cues, and
between other variables, in complex environments. Research in applied
contexts needs to establish more clearly, for example, the difference
between depth cues used for judging object identification, object location,
and surface slant. It needs to establish more clearly how the utility of
different cues is modulated by the transition from static to dynamic displays,
as frame rate is increased. And a better understanding of the difference
between holistic judgments of general location and analytic judgments of
specific distance must be sought. How can technology combine to optimize both
of these? These questions require more empirical data before designers can
make informed choices about the simplicity (or complexity) of technology
necessary to create 3D renderings for a given application purpose.

Finally, we close with a plea, which is appropriate for the choice of
any new display technology. Introduction of that technology for a specific
purpose should be preceded by a careful analysis of the users' information
needs, and the general characteristics of that information. Is check reading
needed? Will distortions lead to serious errors? Is the information dynamic?
Careful consideration of these questions should ensure that the final display
product will be well received.